Information spreading in FriendFeed

Post on 13-Nov-2014

description

Presentation given at the 4th Research Methods Festival - July 5th 2010 - Oxford - St Catherine's College

Transcript of Information spreading in FriendFeed

SIGSNA: Special Interest Group on Social Network Analysis
Luca Rossi - Fabio Giglietto
University of Urbino “Carlo Bo”

persistence / easy to search / scalability / easy to replicate (boyd 2007)

Background: Growing availability of User Generated Content

High research value of spontaneously produced content.

persistence, scalability, replicability, searchability

few Writings Printing press, newspapers

Digital media (pc, video-cameras)

World Wide Web + Google (Google Book Search)

many Writings Personal online publishing / Web 2.0 (Blogs, Flickr, YouTube)

Digital media (pc, video-cameras)

World Wide Web + Google (Google Blog Search)

The Big (online) Data: New opportunities

- high value of UGCs
- huge amount of spontaneous data
- large variety of topics
- worldwide phenomenon (comparative analysis)

The Big Data: New methodological problems

- getting the data
- storing the data
- querying the data
- analysing the data

Interdisciplinary approach

© Flickr.com / GeekMomHeather

Getting the data: RSS feeds (content produced) or API (users info).

Last.FM, Twitter, Flickr, Digg, Netlog, YouTube, MySpace…

Contacts, status, profile, TopUsed…
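As a minimal sketch of the RSS route, the snippet below parses an RSS 2.0 document with the Python standard library and keeps the title, link and timestamp of each item. The feed content, the example.org URLs and the `parse_feed` helper are hypothetical and stand in for a real public feed; this is not the collection code used by the project.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical RSS 2.0 document standing in for a public feed.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example public feed</title>
    <item>
      <title>First post</title>
      <link>http://example.org/e/1</link>
      <pubDate>Tue, 08 Sep 2009 13:57:00 +0200</pubDate>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.org/e/2</link>
      <pubDate>Tue, 08 Sep 2009 14:05:00 +0200</pubDate>
    </item>
  </channel>
</rss>"""


def parse_feed(xml_text):
    """Return one dict per <item>, with the publication date parsed."""
    root = ET.fromstring(xml_text)
    entries = []
    for item in root.iter("item"):
        entries.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            # RSS 2.0 dates use the RFC 822 format handled by email.utils.
            "published": parsedate_to_datetime(item.findtext("pubDate")),
        })
    return entries


entries = parse_feed(SAMPLE_FEED)
```

Keeping the parsed timestamp next to each item matters later on, since the propagation analysis relies on the order in which entries appeared.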

- Legal/ethical issues
- Terms of use

Storing the data:

SIGSNA (two weeks of FriendFeed public data):
≃ 10.500.000 posts (2GB text data).
≃ 500.000 likes.
≃ 450.000 users.
≃ 15 million subscriptions.
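One way to hold these four kinds of records in a relational database is sketched below with SQLite. The table and column names are assumptions made for illustration (the slides only state that a relational DB was used), and the choice to store entries and comments in one `posts` table, linked by a hypothetical `parent_id`, is likewise a guess.

```python
import sqlite3

# In-memory database; a real dataset of this size would live on disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id TEXT PRIMARY KEY);
CREATE TABLE subscriptions (
    follower TEXT NOT NULL REFERENCES users(id),
    followed TEXT NOT NULL REFERENCES users(id),
    PRIMARY KEY (follower, followed)
);
CREATE TABLE posts (
    id TEXT PRIMARY KEY,
    author TEXT NOT NULL REFERENCES users(id),
    parent_id TEXT REFERENCES posts(id),  -- NULL for entries, set for comments
    posted_at TEXT NOT NULL,              -- ISO 8601 timestamp
    body TEXT NOT NULL
);
CREATE TABLE likes (
    user_id TEXT NOT NULL REFERENCES users(id),
    post_id TEXT NOT NULL REFERENCES posts(id),
    PRIMARY KEY (user_id, post_id)
);
CREATE INDEX posts_by_time ON posts(posted_at);
""")

# Hypothetical sample rows.
conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])
conn.execute("INSERT INTO subscriptions VALUES ('bob', 'alice')")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?, ?, ?)",
    [
        ("e1", "alice", None, "2009-09-08T13:57:00", "Breaking news"),
        ("c1", "bob", "e1", "2009-09-08T14:01:00", "A comment"),
    ],
)
conn.execute("INSERT INTO likes VALUES ('bob', 'e1')")
conn.commit()

# Count comments per entry.
comment_counts = dict(conn.execute(
    "SELECT parent_id, COUNT(*) FROM posts "
    "WHERE parent_id IS NOT NULL GROUP BY parent_id"
))
```

The index on `posted_at` is there because time-ordered queries are the ones a propagation study would run most often.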

© Flickr.com / amanderson2

From WOW20 to SIGSNA: Working with online user generated content for Sociological Research

                 WOW20 (2007)          SIGSNA (2009)
Social media     Blogs                 FriendFeed
Type of data     Public RSS feed       Public RSS feed
Database         Relational DB         Relational DB
Extras           Scraping techniques   Language identification
Amount of data   3000 blog entries     10.454.195 FF posts*

* Entries and comments

Summary:

data cleaning

examples:

Heidi: 1974 anime based on Johanna Spyri’s novel ≠ Heidi: 1973 top model

Querying the data
Case study: SIGSNA research on breaking news propagation on FriendFeed

Mike Bongiorno (famous Italian TV host) died on Sept. 8 2009. The news struck FriendFeed at 1:57 PM:
- First entry: > 130 comments
- All entries: > 585 comments

How does news propagate? What kinds of behaviour emerge?

Using timestamps and the network of followers, we were able to track the propagation paths and identify the major hubs.
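The idea can be sketched as follows, under a simplifying assumption that is ours and not necessarily the rule used in the study: each user who posts about the news is attributed to the followed user who posted about it earliest. Users with no such source start their own chain, chain length is the number of hops back to a starting user, and hubs are the users with the most direct successors. All names and timings below are hypothetical.

```python
from collections import Counter

# Hypothetical data: user -> minutes after the first entry at which
# the user first posted about the news.
first_post = {"ann": 0, "bob": 4, "cat": 9, "dan": 12, "eve": 30}

# follows[u] = set of users whose posts u can see.
follows = {
    "bob": {"ann"},
    "cat": {"ann", "bob"},
    "dan": {"cat"},
    "eve": set(),
}


def propagation_parents(first_post, follows):
    """Map each poster to the earliest-posting user they follow, or None."""
    parents = {}
    for user, t in first_post.items():
        earlier = [
            v for v in follows.get(user, ())
            if v in first_post and first_post[v] < t
        ]
        # Ties on time are broken alphabetically to keep the result stable.
        parents[user] = (
            min(earlier, key=lambda v: (first_post[v], v)) if earlier else None
        )
    return parents


def chain_depth(user, parents):
    """Number of hops from a user back to the start of its chain."""
    depth = 0
    # A parent always posted strictly earlier, so this loop terminates.
    while parents[user] is not None:
        user = parents[user]
        depth += 1
    return depth


parents = propagation_parents(first_post, follows)
depths = {u: chain_depth(u, parents) for u in parents}
hubs = Counter(p for p in parents.values() if p is not None)
```

In this toy network `ann` is the main hub with two direct successors, `dan` sits at the end of a two-hop chain, and `eve` posts without any followed source, which corresponds to the "no propagation" case.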

Long propagation chains No propagation

Short propagation chains

Explicit news sharing is followed by chatting and discussion. This kind of activity contributes to news propagation.

“Bye Mike! We’re missing you! Bye grandpa Mike! Mike, you’ve been a milestone of our TV”

First entry has the highest informative function

Most commented entry is a long and articulated discussion

More info, papers and data: http://larica.uniurb.it/sigsna

SIGSNA is a joint research project with the Department of Computer Science of the University of Bologna (Dr. Matteo Magnani) and is partially funded by Telecom Italia.