Community detection from a computational social science perspective

COMMUNITY DETECTION FROM A COMPUTATIONAL SOCIAL SCIENCE PERSPECTIVE Davide Bennato Università di Catania [email protected] @tecnoetica

description

This is the talk I gave at the Lipari Summer School on Computational Social Science, 2014. What are the sociological strategies for detecting communities in social media? How can we define a community from a computational social science point of view?

Transcript of Community detection from a computational social science perspective

COMMUNITY DETECTION FROM A COMPUTATIONAL SOCIAL SCIENCE PERSPECTIVE

Davide Bennato

Università di Catania

[email protected]

@tecnoetica

LIPARI 23/07/2014 DAVIDE BENNATO

COMMUNITY:

(brief) SOCIOLOGICAL HISTORY

• Ferdinand Tönnies: Gemeinschaft und Gesellschaft (1887)

Community: groupings based on feelings of togetherness and on mutual bonds

Society: groups sustained by being instrumental to their members' individual aims and goals

• Georg Simmel: Sociability (1908)

All the forms of association by which a mere sum of separate individuals is made into a “society” (Ritzer)

Social geometry: dyad (relation between two entities), triad (relation between three entities)

Circles: social structure surrounding people based on a special interest

David Armano: https://www.flickr.com/photos/7855449@N02/2779601063

http://www.lsu.edu/faculty/fweil/SimmelCircles.htm

I thus designate sociability as the play-form of sociation. Its relation to content-determined, concrete sociation is similar to that of the work of art to reality. [...] Sociability has no objective purpose, no content, no extrinsic results, it entirely depends on the personalities among whom it occurs. Its aim is nothing but the success of the sociable moment and, at most, a memory of it. Hence the conditions and results of the process of sociability are exclusively the persons who find themselves at a social gathering. (G. Simmel, 1908)

• Barry Wellman: Networked individualism (2002)

A community is a network of relationships

«in practice, societies and people’s lives are often mixtures of groups and networks»

Mark Lombardi: http://modcult.org/image/1976

• Little boxes (Wellman 2002)

Pre-industrial social relationships were based on itinerant bands, agrarian villages, trading towns, and urban neighborhoods. People walked door-to-door to visit each other in spatially compact and densely-knit milieus. If most settlements or neighborhoods contained less than a thousand people, then almost everybody would know each other. Communities were bounded, so that most relationships happened within their gates rather than across them. Much interaction stayed within neighborhoods, even in big cities and trading towns. When people visited someone, most neighbors knew who was going to see whom and what their interaction was about. Contact was essentially between households, with the awareness, sanction and control of the settlement.

• Glocalized networks (Wellman 2002)

If “community” is defined socially rather than spatially, then it is clear that contemporary communities rarely are limited to neighborhoods. They are communities of shared interest rather than communities of shared kinship or locality. People usually obtain support, companionship, information and a sense of belonging from those who do not live within the same neighborhood or even within the same metropolitan area. Many people’s work involves contact with shifting sets of people in other units, workplaces, and even other organizations. People maintain these ties through phoning, emailing, writing, driving, railroading, transiting, and flying

• Networked Individualism (Wellman 2002)

We are now experiencing another transition, from place-to-place to person-to-person connectivity. Moving around with a mobile phone, pager, or wireless Internet makes people less dependent on place. Because connections are to people and not to places, the technology affords shifting of work and community ties from linking people-in-places to linking people wherever they are. It is I-alone that is reachable wherever I am: at a house, hotel, office, freeway or mall. The person has become the portal […] The shift to a personalized, wireless world affords networked individualism, with each person switching between ties and networks. People remain connected, but as individuals rather than being rooted in the home bases of work unit and household. Individuals switch rapidly between their social networks. Each person separately operates his networks to obtain information, collaboration, orders, support, sociability, and a sense of belonging.

COMMUNITY:

SOCIOLOGICAL PROPERTIES

• People

Small number

Large number

https://www.flickr.com/photos/jbid-post/6555965015/

• Relationship

Strong: stable, everyday

Weak: unstable, occasional

https://www.flickr.com/photos/bunnyrel/8290653345/

• Social semantics

Content based: e.g. Football fans

Project based: e.g. activists

Relationship based: e.g. friendship, brotherhood

https://www.flickr.com/photos/paulisson_miura/14265444730/

• Time

Permanent: e.g. family ties

Temporary: e.g. media audience

https://www.flickr.com/photos/adamendoza/3150313038/

• Place

Physical: e.g. neighbourhood

Digital: e.g. social network connections

Blurred: e.g. earthquake tweets

https://www.flickr.com/photos/sierragoddess/5435989568/

COMMUNITY:

SOCIO-COMPUTATIONAL PROPERTIES

• Concept: Modularity

“It’s a formalization of the idea that communities should contain many connections within and few outside of the group” (Jürgens 2014)
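The idea in the quote above can be made concrete with a small worked example. The sketch below assumes Newman's modularity score Q for an undirected, unweighted graph without self-loops, which is one common formalization and is not spelled out on the slide. The function name, the toy edge list and the two partitions are illustrative assumptions.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity Q of a partition (undirected, unweighted, no self-loops)."""
    m = len(edges)
    degree = defaultdict(int)
    internal = defaultdict(int)  # edges with both endpoints in the same community
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1
    degree_sum = defaultdict(int)  # total degree of the nodes in each community
    for node, d in degree.items():
        degree_sum[community_of[node]] += d
    # Q = sum over communities of (share of internal edges) - (expected share)
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


# Toy graph (assumption): two triangles joined by the single link c-d.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f")]

two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in "abcdef"}

q_two = modularity(edges, two_groups)
q_one = modularity(edges, one_group)
print(q_two, q_one)
```

Splitting the toy graph into its two triangles scores higher than lumping all nodes together, which is the "many connections within, few outside" intuition expressed as a number.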

• Concept: Clique

“A clique is a subset of points in which every possible pair of points is directly connected by a line and the clique is not contained in any other clique” (Scott 2000)
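Scott's definition describes what graph theory calls a maximal clique. As a minimal sketch, the basic Bron-Kerbosch recursion (without pivoting) enumerates such cliques; this algorithm is an assumption chosen for illustration, not something named on the slide, and the helper names and toy graph are hypothetical.

```python
def build_adjacency(edges):
    """Map each node to the set of its neighbours (undirected graph)."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def maximal_cliques(adjacency):
    """Yield every maximal clique using basic Bron-Kerbosch (no pivoting)."""
    def expand(clique, candidates, excluded):
        # Nothing left to add and nothing skipped: the clique is maximal.
        if not candidates and not excluded:
            yield frozenset(clique)
            return
        for node in list(candidates):
            neighbours = adjacency[node]
            yield from expand(clique | {node},
                              candidates & neighbours,
                              excluded & neighbours)
            candidates = candidates - {node}
            excluded = excluded | {node}

    yield from expand(set(), set(adjacency), set())


# Toy graph (assumption): two triangles joined by the single link c-d.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f")]

found = set(maximal_cliques(build_adjacency(edges)))
for clique in sorted(found, key=sorted):
    print(sorted(clique))
```

Under this definition the connecting pair c-d also counts as a (two-node) clique, because no larger fully connected subset contains it.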

• Algorithm: Girvan and Newman (Jürgens 2014)

“It’s based on ‘betweenness centrality’: how many shortest paths across the network lead through one link”

“One by one the links through which the most short connections lead are removed”

http://www.chinaz.com/web/2012/1224/286875_2.shtml
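The two steps quoted above (score each link by the shortest paths running through it, then remove the top-scoring link) can be sketched as follows. This is a simplified illustration for small undirected, unweighted graphs without self-loops: it uses a Brandes-style breadth-first computation of edge betweenness and stops at the first split, whereas the full procedure keeps going to build a hierarchy. All function names and the toy graph are assumptions.

```python
from collections import defaultdict, deque


def edge_betweenness(adj):
    """Number of shortest paths through each link (Brandes-style accumulation)."""
    score = defaultdict(float)
    for source in adj:
        dist = {source: 0}
        sigma = defaultdict(float)  # count of shortest paths from source
        sigma[source] = 1.0
        preds = defaultdict(list)
        order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = defaultdict(float)
        for w in reversed(order):  # farthest nodes first
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += share
                delta[v] += share
    # Each undirected path was counted once from each of its two endpoints.
    return {link: value / 2.0 for link, value in score.items()}


def components(adj):
    """Connected components as a list of node sets."""
    seen, parts = set(), []
    for start in adj:
        if start in seen:
            continue
        part, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in part:
                part.add(node)
                stack.extend(adj[node] - part)
        seen |= part
        parts.append(part)
    return parts


def girvan_newman_split(edges):
    """Remove the highest-betweenness link until the network falls apart once."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    start = len(components(adj))
    while len(components(adj)) == start:
        scores = edge_betweenness(adj)
        if not scores:
            break
        u, v = max(scores, key=scores.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)


# Toy graph (assumption): two triangles joined by the single link c-d.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f")]
groups = sorted(sorted(part) for part in girvan_newman_split(edges))
print(groups)
```

In the toy graph every shortest path between the two triangles runs through c-d, so that link is removed first and the two triangles remain as communities.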

https://ultrabpm.wordpress.com/2013/03/25/social-network-analysis-part-two/

• Algorithm: clique percolation (Jürgens 2014)

“Networks can be said to have choke points that separate two well connected areas from each other.”

“CP finds cliques where every node is connected to every other node and ‘moves’ them across the network until they reach a choke point”

“As the algorithm increases the size of cliques fewer and fewer communities exist”

http://journals.aps.org/prl/abstract/10.1103/PhysRevLett.94.160202
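One way to read the "moving clique" picture is the usual k-clique formulation: two cliques of k nodes are neighbours when they share k-1 nodes, and a community is the union of all cliques reachable from one another through such steps. The sketch below assumes that formulation and uses brute-force enumeration, so it is only suitable for very small graphs. The function name and the toy graph are hypothetical, and node labels are assumed to be sortable.

```python
from itertools import combinations


def k_clique_communities(edges, k):
    """Communities as unions of k-cliques chained through k-1 shared nodes."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Brute force: every k-node subset whose members are all pairwise linked.
    cliques = [frozenset(group)
               for group in combinations(sorted(adj), k)
               if all(b in adj[a] for a, b in combinations(group, 2))]

    # Union-find over cliques: merge two cliques that overlap in k-1 nodes.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    communities = {}
    for i, clique in enumerate(cliques):
        communities.setdefault(find(i), set()).update(clique)
    return list(communities.values())


# Toy graph (assumption): triangles a-b-c and b-c-d share the link b-c;
# triangle e-f-g hangs off the single link d-e, which acts as a choke point.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"),
         ("d", "e"),
         ("e", "f"), ("e", "g"), ("f", "g")]

with_k3 = sorted(sorted(c) for c in k_clique_communities(edges, 3))
with_k4 = k_clique_communities(edges, 4)
print(with_k3, with_k4)
```

With k = 3 the triangle can roll from a-b-c to b-c-d but cannot cross d-e, so two communities appear; with k = 4 the toy graph has no fully connected group of four, so none remain.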

http://horicky.blogspot.it/2012/11/detecting-communities-in-social-graph.html

COMMUNITY: CASE STUDY

http://adequatebird.com/2010/05/03/the-political-blogosphere-and-the-2004-u-s-election-divided-they-blog/

Political blogosphere (Adamic, Glance 2005)

http://inmaps.linkedinlabs.com/

http://blog.socialflow.com/post/5246404319/breaking-bin-laden-visualizing-the-power-of-a-single

Mapping Twitter Topic Networks (Pew Research Center 2014)

Mayoral candidates in Catania (Bennato, Miceli 2013)

Resignation of Benedict XVI (Bennato, Miceli 2013)

Festival of Saint Agatha in Catania (Bennato, Miceli 2013)

• Mentionmapp http://mentionmapp.com/

Research strategy: networking

Metrics: followers, hashtags

• Analyzing relationships (SNA approach)

NodeXL: http://nodexl.codeplex.com/

Gephi: https://gephi.org/

• Bibliography

Adamic, L., Glance, N., 2005, The Political Blogosphere and the 2004 U.S. Election: Divided They Blog, LinkKDD '05 Proceedings of the 3rd International Workshop on Link Discovery, pp.36-43.

Jürgens, P., 2014, Communities of Communication: Making Sense of the “Social” in Social Media, in Bredl, K., Hünninger, J., Jensen, J. L. (Eds.), Methods for Analyzing Social Media, Routledge, London, pp.45-62.

Scott, J., 2000, Social Network Analysis, Sage, London.

Wellman, B., 2002, Little Boxes, Glocalization, and Networked Individualism, in Tanabe, M., van den Besselaar, P., Ishida, T. (Eds.), Digital Cities, Springer-Verlag, Berlin.

Welser, H. T., Smith, M., Fisher, D., Gleave, E., 2008, Distilling Digital Traces: Computational Social Science Approaches to Studying the Internet, in Fielding, N., Lee, M. L., Blank, G. (Eds.), The SAGE Handbook of Online Research Methods, SAGE, London, pp.116-140.

• Davide Bennato is professor of Sociology of culture and communication and of Sociology of digital media at the Department of Humanistic Sciences of the University of Catania.

• He has taught at several Italian universities: Roma “La Sapienza”, LUISS, Università di Siena, Università del Molise.

• He is a founding member and was vice-president (2005-08) of STS-Italia (Science and Technology Studies Italian Association). He is a member of the board of Bench s.r.l., a University of Catania spin-off in social and marketing research.

• His research topics are technological cultures, digital content consumption, and interpersonal relationships in social media.

• His studies are based on computational social science, a computer-based approach to modelling social relationships and culture that uses social analytics techniques.

• Books: Le metafore del computer. La costruzione sociale dell’informatica (Meltemi, 2002) and Sociologia dei media digitali (Laterza, 2011).

• Book chapters and articles: 2014a, The Open Laboratory: Limits and Possibilities of Using Facebook, Twitter, and YouTube as a Research Data Source (with F. Giglietto, L. Rossi), in Bredl et al. (Eds.), Methods for Analyzing Social Media, Routledge, New York; 2014b, Smart City, Smart Data. L’uso dei dati alla ricerca di una città sostenibile, in “Lettera Internazionale”, n.118, pp.40-43; 2014c, Etica dei Big data. Le conseguenze sociali della raccolta massiva di informazioni, in “Studi culturali”, n.1, pp.86-92; forthcoming, La dataveglianza di massa. Conseguenze etiche e relazionali delle scelte tecnologiche di Facebook, in Greco G. (Ed.), Pubbliche intimità. L’affettivo quotidiano nei siti di social network, Franco Angeli, Milano, pp.107-118.

• Davide Bennato

Sociologia dei media digitali, Laterza, Roma-Bari, 2011

• Millions of people consult and interact with each other through the internet. Each of them, in their own way, takes part in the networking of news and also in the transformation of these tools of communication and socialization. Blogs, wikis and social networks are, above all, tools of social relationship. The participative web therefore demands a profound rethinking of the classical concepts of the sociology of communication.

• Davide Bennato offers a detailed analysis of the different tools and platforms well known to the public, from Facebook to YouTube, and examines the ethical and social consequences of the use of new technologies.

• The book on the internet

Website: http://www.sociologiadeimediadigitali.it

Facebook fanpage: http://www.facebook.com/sociologiadeimediadigitali

Twitter: http://twitter.com/mediadigitali

Social media

http://twitter.com/tecnoetica

http://www.facebook.com/davide.bennato

http://www.linkedin.com/in/davidebennato

http://www.youtube.com/tecnoetica

http://pinterest.com/davidebennato/

Skype

davide.bennato

Blog

www.tecnoetica.it

www.processiculturali.it

www.sociologiadeimediadigitali.it