DISTINGUISHING TOPICAL AND SOCIAL GROUPS BASED ON COMMON IDENTITY AND BOND THEORY
description
Transcript of DISTINGUISHING TOPICAL AND SOCIAL GROUPS BASED ON COMMON IDENTITY AND BOND THEORY
DISTINGUISHING TOPICAL AND SOCIAL GROUPS BASED ON COMMON IDENTITY
AND BOND THEORY
Przemyslaw A. GrabowiczLuca M. Aiello
Vìctor M. EguìluzAlejandro Jaimes
We built a classifier that distinguishesif a given set of people is either
a social or a topical group.
What aresocial
andtopicalgroups?
Social groups
Friends
www.news.com.au
Characteristics of social groups
• Direct reciprocity of interactions
• Small talk (broad range of topics in conversations)
Topical groups
http://www.flickr.com/photos/59571907@N03/5545401056/
A cameraclub
Characteristics of topical groups
• General reciprocity of interactions
• Conversations on a narrow range of topics
Two types of metrics
Based on reciprocity of
interactions
Based on diversity of topics (Shannon’s entropy)
Reciprocity metrics
intra-group reciprocity
intra-reciprocity:
--------------------------------------------------
inter-reciprocity:
1.
2.
Diversity of topics’ metrics
H(g) – Shannon’s entropy of terms/tags
normalized by the average for all groups having the same number of terms
1.
2.
Dataset – Flickr, 2008
Tags extracted from photos:
• from a group pool• commented in a group
• favorited in a group
Human labeling of groups
Consists of exploring:
• text of comments• group profiles
• photos• tags
• maps
Results
Normalized entropyReciprocity
Classifier
score Sg
3 tg
3 ug 3
hg AUC0.75
Accuracy0.76
1.
2.classifie
r3 tg
3 ug
3 hg
3 Hg
3 ag3 bg
size
3 Eg
AUC0.88
Accuracy0.80
hg for comments1
tg for comments2
ug for comments3
hg for favorites4
bg for comments5
AUC0.87
Accuracy0.80
ConclusionsFindings:• The metrics work as the theory predicts• Agreement and accuracy depend on value of the
score• Groups found with a community detection algorithm
are more social than declared groups
Future work:• Entropy is a simple measure, could be replaced with
something what understands texto NLP
• Binary classifier has its limitationso multi-label classification