Keep Your Friends Close and Your Facebook Friends Closer · Keep Your Friends Close and Your...
Transcript of Keep Your Friends Close and Your Facebook Friends Closer · Keep Your Friends Close and Your...
Keep Your Friends Close and Your Facebook Friends Closer:A Multiplex Network Approach to the Analysis of Offline and Online Social Ties
Desislava Hristova*, Mirco Musolesi**, Cecilia Mascolo**University of Cambridge, **University of Birmingham
Text Message
Phone Call
Co-location
Close Friends
Facebook Friends
No Relationship
“…more strongly tied pairs make use of more of the available media,
a phenomenon I have termed media multiplexity” - Caroline
Haythornwaite
α
1
2
3
n1
n2
n3
n1
n2
n3
n1
n2
n3
α1 ∪ α2 ∪ α3
α1 ∩ α2 ∩ α3
n1
n2
n3
n1
n2
0
10
20
30
40
0 5 10 15 20 25Call degree
Frequency
0
10
20
30
40 80 120 160Proximity degree
Frequency
0
10
20
30
40
50
0 2 4 6 8SMS degree
Frequency
Degree
<k> = 2
<k> = 6
<k> = 61
intersection all proximity and calls proximity only union all
0.0
0.2
0.4
0.6
none fb soc cf none fb soc cf none fb soc cf none fb soc cf
P(X=x)
Relationship
None(none)
Facebook(fb)
Socialize(soc)
Close Friends(cf)
intersection all proximity and calls proximity only union all
0.0
0.2
0.4
0.6
none fb soc cf none fb soc cf none fb soc cf none fb soc cf
P(X=x)
Relationship
None(none)
Facebook(fb)
Socialize(soc)
Close Friends(cf)
(H1); If we consider the number of layers as an indicator oftie strength, we can describe the strength of a tie in terms ofa multiplex edge weight in the network as:
mw
ij
=MX
↵=1
a
↵
ij
M
(4)
where M is the total number of layers in the multiplex net-work. For example, if two students utilize all possible chan-nels for communication (in our case M = 3), mw
ij
will beequal to 1, whereas if they will use one channel, mw
ij
willbe 1/3 . Next, we will apply this multiplex weight to assessthe relationship between multiplexity and user profile simi-larity in terms of the political, music, health and situationalfactors of the students.
H2: Multiplexity & Profile SimilarityIn order to assess whether multiplexity and profile similarityare related to one another, we compute the similarity scoresbetween students, using the cosine similarity of the vectorof attributes for each category - music, health, political, andsituational for each pair of students. These values are de-scribed in detail in Table 1. As an example, if two studentsare both somewhat interested in politics (value = 2), and oneis slightly liberal (value = 5), while the other is slightly con-servative (value = 3), their cosine similarity (sim = 0.98)would be higher than a pair of students with the same po-litical orientations but where one student is not at all inter-ested (value = 0) and the other is very interested (value =3) (sim = 0.7). Each category has a different number andrange of attributes, and the similarity scores vary accord-ingly, however the magnitude is consistent and allows us toperform graph correlation analysis, as described next.
To find the relationship between the multiplexity of a tie(mw
ij
), and its similarity, based on the political, health,music preferences and situational factors described in theDataset section, we use a graph correlation coefficient. It isderived from the comparison of the adjacency matrix of themultiplex graph (constructed from the multiplex weights onthe edges), and the pairwise similarity score matrix, using astandard matrix correlation coefficient function:
C
i
=
nPj2Ni
w
a
i,j
⇤ wb
i,j
snP
j2Ni
w
a
i,j
⇤nP
j2Ni
w
b
i,j
(5)
for the correlation coefficient per node with normalizedweights, and the correlation between two matrices as:
C(Aa
, A
b) =
nPi
C
i
N
(6)
where N is the number of nodes in the correspondent ma-trices.
The results of the correlations are presented in Table 3,where we can see that there is a significant positive relation-ship between the multiplex edge weight and the similarity
(a) political (b) health
(c) music (d) residence area/year
Figure 4: Spearman degree rank correlations (⇢ betweensimilarity and multiplex networks. (a) political ⇢ = 0.9 (b)health ⇢ = 0.85 (c) music ⇢ = 0.9 (d) residence area/year⇢ = 0.7
across categories. The highest correlations are with the po-litical and health factors (r = 0.6 for both), signifying thatthese are most closely related to tie strength but comparablyso with situational factors(r = 0.56), and music (r = 0.49).maybe clarify type of correlation?
Table 3: Graph correlations between multiplex weight andeach similarity score, p-value < 0.001 **, 0.01 *
political music health situational0.6** 0.49* 0.6** 0.56*
Now that we know that edges with high similarity also ex-hibit high multiplexity, we consider whether this holds trueon a network level. We measure the degree of each node interms of similarity and multiplexity and report the Spear-man degree rank correlation in Fig. 4. We observe that thosenodes with high similarity degree, also have a high multi-plexity degree. This signifies that students who are more so-cial are also more similar to their neighbours, and that pop-ular students are more similar to other students, than lesspopular ones.
We are able to confirm that the greater the multiplex-ity, the greater the similarity observed across factors (H2),both on the individual edge and neighbourhood levels. Ho-mophily effects, which are exhibited on the network level,are explored in the following section.
Multiplex distance: the shortest path distance
between two nodesin a network where edges
are weighted by their multiplexity
☞ Strong social ties are characterized by maximal
use of communication channels.
☞ Weak social ties are characterized by minimal
use of communication channels.
☞ A multiplex network approach helps analyze
social ties and their strength.
☞ Similarity between pairs is greater at a shorter
distance in the multiplex network.
☞ Online social circles can provide greater
diversity of information in politics and music.
Do stronger ties utilize more
communication channels?
Are people who utilize more
communication channels more
similar?
Are online social circles more diverse
than offline social circles?
0
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
0 0.1 0.2 0.3 0.4 0.6 0.7 0.8 1network distance
cosi
ne s
imila
rity
0.00
0.25
0.50
0.75
1.00p(x|y)
00.10.20.30.40.50.60.70.80.91
0 0.1 0.2 0.3 0.4 0.6 0.7 0.8 1network distance
cosi
ne s
imila
rity
0.0
0.1
0.2
0.3
0.4p(x|y)
00.20.30.40.50.60.70.80.91
0 0.1 0.2 0.3 0.4 0.6 0.7 0.8 1network distance
cosi
ne s
imila
rity
0.0
0.2
0.4
0.6
p(x|y)
0
0.4
0.5
0.6
0.7
0.8
0.9
1
0 0.1 0.2 0.3 0.4 0.6 0.7 0.8 1network distance
cosi
ne s
imila
rity
0.0
0.2
0.4
0.6
p(x|y)
Music PreferencesPolitical Orientation
Situational Factors Health Factors
Survey Data**All data is available on the MIT Reality Commons webpage, kindly provided by
the MIT Human Dynamics Laboratory
Category Parameters Range avg
Political
Health
Music
Situational
Interest in politicsPolitical orientation
Very - Not at allLiberal - Conservative
Somewhat InterestedLiberal
Weight (lb)Height (in)Salads per weekFruits per dayAerobics per week (days)Sports per week (days)
81 - 330 lb60 - 81 in0 - 6 salads0 - 7 fruits0 - 7 days0 - 6 days
158 lb67 in1.5 salads2 fruits2 days1 day
Indie/Alternative RockTechno/Lounge/ElectronicHeavy Metal/HardcoreClassic RockPop/Top 40Hip-Hop/R&BJazzClassicalCountry/FolkShow tunesOther
No - High Interest
Moderate InterestModerate InterestSlight InterestModerate InterestModerate InterestSlight InterestSlight InterestModerate InterestSlight InterestSlight InterestSlight Interest
Year in collegeResidential sector
Freshman - Graduate1 - 8
Junior2
-0.2
-0.1
0.0
0.1
0.2
health music political year/floor
Δsim
category
health
music
political
year/floor
Graph Aggregations
Multiplex Graphs
Similarity & Distance
Online Diversity
The first two figures show an overall homogeneity interms of political and health factors. At a distance of 1 (non-connected nodes), the probability of high similarity is stillhigh, indicating that two nodes with high multiplexity andhigh similarity in these categories could be connected at ran-dom. High homogeneity can be expected in the study, giventhat students are co-residing and share the same context.
On the other hand, homophily exists where there is non-random similarity between individuals with shorter multi-plex distance. This is most evident in subfigure 5c - musicpreferences, where there is a clear probability increase fromtop left to low right as distance increases. This means thatthose pairs who are closest in the network, are also highlylikely to have a similar taste in music (sim = 0.9, wheredist = 0), whereas those pairs who are further from eachother in the network have a lower similarity in music taste.For unconnected pairs, the similarity is especially low (near0), indicating that edges in the network with respect to mu-sic are non-random and highly dependent on musical prefer-ences.
Fig. 5d on the other hand shows an interesting divide be-tween low and high similarities. Most pairs are grouped intovery high or very low similarity, and appear at the top rowand bottom row of the graph. Therefore, students tend to beeither in the same year and floor or in different years anddifferent floors, which may be as a result of room allocationaccording to year. There is a high chance (P = 0.7) thatthose who live and study together (sim = 1), also have ahighly multiplex tie, represented by the distinguishable topleft tile at position (0,1), and less so for those who live fur-ther apart and/or are in a different year.
Despite the community being highly homogeneous in po-litical and health aspects, we find that the shorter the dis-tance in the multiplex network, the greater the similaritybetween two nodes (H3) in the music and situational cate-gories, indicative of homophily. The decrease of diversity ofresources such as information is a potentially harmful effectof homophily and homogeneity in a community. We there-fore conclude our analysis by exploring the role of onlineand offline social circles in diversifying information in thestudent community in the following section.
H4: Diversity in Offline & Online Social CirclesSo far, we have identified homogeneity and homophilywithin the student community. Especially during the forma-tive university years, it is important to have access to diverseinformation and opinions. In this section we assess the valueof social media such as Facebook, in introducing informa-tion diversity across the examined categories in this work bycomparing students’ offline social circles with their onlineones.
We define diversity as the opposite of homogeneity, andmeasure the � difference (change) in similarity between theonline and offline neighborhoods of each node. The onlineneighborhood of a node includes all nodes to which it isconnected where there is any declared social relationship,since we know that all relationships are a subset of Face-book (CF ⇢ SC ⇢ FB). The offline neighborhood in-cludes all other connections to nodes in the graph, where
00.20.30.40.50.60.70.80.91
0 0.1 0.2 0.3 0.4 0.6 0.7 0.8 1network distance
cosi
ne s
imila
rity
0.0
0.2
0.4
0.6
p(x|y)
(a) political
0
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
0 0.1 0.2 0.3 0.4 0.6 0.7 0.8 1network distance
cosi
ne s
imila
rity
0.00
0.25
0.50
0.75
1.00p(x|y)
(b) health
00.10.20.30.40.50.60.70.80.91
0 0.1 0.2 0.3 0.4 0.6 0.7 0.8 1network distance
cosi
ne s
imila
rity
0.0
0.1
0.2
0.3
0.4p(x|y)
(c) music
0
0.4
0.5
0.6
0.7
0.8
0.9
1
0 0.1 0.2 0.3 0.4 0.6 0.7 0.8 1network distance
cosi
ne s
imila
rity
0.0
0.2
0.4
0.6
p(x|y)
(d) situational (year/floor)
Figure 5: Conditional probability of similarity given a certain net-work distance (P (x|y)) in terms of multiplex weighted shortestpath. We consider the entire multiplex networks, where the maxi-mum distance was 0.9, and 1 is where no path exists between twonodes.
there is no stated social relationship in the surveys. To ob-tain the � change between the online and offline similarity,we subtract the average online similarity of an ego from theaverage offline similarity. Formally:
�sim =
Psimoff
|Noff |�
Psim
on
|Non
| (7)
where simoff and sim
on
are the similarity values betweenthe offline and online contacts of a given student, and Noffand N
on
are his offline and online neighborhoods.In Fig. 6, values above 0 indicate that there is greater on-
line diversity than offline, whereas values below zero indi-cate that there is a lower diversity online than offline. Theboxplot can be read as the distribution of �sim, where thevalues are divided into quartiles. The top and bottom contin-uous lines (or whiskers) represent the top and bottom quar-tiles, whereas the values inside the box, below and above thethick line representing the median, are the upper and lowermid-quartiles.
We can see immediately that the situational (year/floor)category online is largely less diverse than offline. In otherwords, students are more often Facebook friends with otherstudents in their year and residential sector but communicateoffline with students from other residential sectors and/oryears. Interestingly, we find that in terms of political orienta-tion, Facebook friendship between students tends to be morevaried than offline, with more than 50% of values above
Greater diversity is possible within online social circleswith respect to the political and music categories.
Take-aways
Multiplex Network Approach
n = 74TM ⊂ PC ⊂ CL
Keep Your Friends Close and Your Facebook Friends Closer: A Multiplex Network Approach to the Analysis of Offline and Online Social Ties. Desislava Hristova, Mirco Musolesi, Cecilia Mascolo. The International AAAI Conference on Weblogs and Social Media (ICWSM) 2014, Ann Arbor.
Contact: [email protected] | www.cl.cam.ac.uk/~dh475/ | @desi_hristova