Twitterology

www.nicolaperra.com Nicola Perra TWITTEROLOGY

TWITTER’S FACTS

255 million monthly active users

500 million tweets per day

78% of users are active on mobile devices

77% of accounts are outside the U.S.

585 gallons (>2000 liters) of coffee per week

THE REAL QUESTION

What can we do with it??!!

ANSWER QUESTIONS ABOUT US

How large is the circle of our friends?

The social brain hypothesis:

Typical social group size is determined by neocortical size

Measured in various primates and extrapolated to humans: 100-200 (Dunbar’s Number)

VALIDATION OF DUNBAR’S NUMBER IN TWITTER CONVERSATIONS

Using 380 million @ messages from about 1.7 million users, we built the reciprocated weighted network.

[Figure. A) Example sequence of tweets exchanged among Alice, Bob, Cathy, Dan, and Ellie. B) The corresponding user network (nodes A to E), with each node annotated with its $k^{out}$, $k^{in}$, $w^{in}$, and $w^{out}$.]

B. Goncalves, N. Perra, A. Vespignani, Modeling Users' Activity on Twitter Networks: Validation of Dunbar's Number, PLoS ONE 6(8), 2011
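
One way to picture the construction of a reciprocated weighted network from @ messages is the minimal sketch below. It assumes each message is reduced to a (sender, receiver) pair, counts messages per directed pair as the edge weight, and keeps an edge only if the reverse edge also exists. The function name and the toy users are hypothetical, and the published analysis may differ in its details.

```python
from collections import Counter

def reciprocated_weighted_network(messages):
    """Count messages per directed (sender, receiver) pair and keep
    only the edges whose reverse edge is also present."""
    # Edge weight = number of messages sent from s to r (self-mentions ignored).
    weights = Counter((s, r) for s, r in messages if s != r)
    return {
        (s, r): w
        for (s, r), w in weights.items()
        if (r, s) in weights
    }

# Toy @-message log with hypothetical users (not data from the study).
messages = [
    ("alice", "bob"), ("bob", "alice"), ("alice", "bob"),
    ("cathy", "bob"),                    # never answered, so it is dropped
    ("dan", "alice"), ("alice", "dan"),
]
network = reciprocated_weighted_network(messages)
for edge, weight in sorted(network.items()):
    print(edge, weight)
```

Filtering on reciprocity keeps only pairs of users who talk to each other, which is closer to the idea of a social tie than a one-way mention.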

[Figure. A) Average weight per connection, $\omega_i^{out} = \sum_j \omega_{ij} / k_i^{out}$, as a function of $k^{out}$ (0 to 600). B) Second panel plotted against $k^{in}$ (0 to 600).]

Number of connections for which interaction strength is highest
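
The quantity on the vertical axis, the average weight per connection (total outgoing weight of user i divided by the out-degree of i), can be computed as in the sketch below. It assumes the network is stored as a dictionary mapping (source, target) pairs to weights. The helper names and the toy edges are illustrative assumptions, not the authors' code.

```python
from collections import defaultdict

def average_out_weight(edges):
    """Return {user: (k_out, omega_out)}, where omega_out is the user's
    total outgoing weight divided by the number of outgoing connections."""
    total = defaultdict(int)
    degree = defaultdict(int)
    for (src, _dst), w in edges.items():
        total[src] += w
        degree[src] += 1
    return {u: (degree[u], total[u] / degree[u]) for u in degree}

def mean_by_degree(per_user):
    """Average omega_out over all users that share the same k_out."""
    buckets = defaultdict(list)
    for k_out, omega in per_user.values():
        buckets[k_out].append(omega)
    return {k: sum(v) / len(v) for k, v in sorted(buckets.items())}

# Toy weighted network (hypothetical values).
edges = {
    ("a", "b"): 4, ("a", "c"): 2,
    ("b", "a"): 3,
    ("c", "a"): 1,
}
per_user = average_out_weight(edges)
curve = mean_by_degree(per_user)   # k_out -> mean weight per connection
print(curve)
```

Plotting `curve` for a real network and locating its maximum gives the number of connections at which interaction strength is highest.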

MAPPING THE USE OF LANGUAGES

D. Mocanu, A. Baronchelli, N. Perra, B. Goncalves, A. Vespignani, The Twitter of Babel: Mapping World Languages through Microblogging Platforms, PLoS ONE, 8(4), 2013

564 days of data collection (Twitter’s gardenhose)

~650K tweets/day with live GPS

~6M users

191 countries (110 analyzed)

78 languages detected (using the Chromium language detector)
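
Once every geolocated tweet carries a country and a detected language, mapping language use reduces to counting. The sketch below assumes the tweets have already been reduced to (country code, language code) pairs. The language detection step itself is not shown, and the function name and sample records are hypothetical.

```python
from collections import Counter, defaultdict

def language_shares(tweets):
    """Return {country: {language: fraction of that country's tweets}}."""
    counts = defaultdict(Counter)
    for country, lang in tweets:
        counts[country][lang] += 1
    shares = {}
    for country, per_lang in counts.items():
        n = sum(per_lang.values())
        shares[country] = {lang: k / n for lang, k in per_lang.items()}
    return shares

# Hypothetical (country, language) records, not data from the study.
tweets = [
    ("CH", "de"), ("CH", "de"), ("CH", "fr"), ("CH", "it"),
    ("BE", "nl"), ("BE", "fr"),
]
shares = language_shares(tweets)
print(shares["CH"])
```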

UNDERSTANDING ELECTIONS

PREDICT THE SEASONAL FLU

Two modeling techniques: fits vs. generative models

How can we merge the two approaches?

[Diagram: a three-stage pipeline (STAGE 1, STAGE 2, STAGE 3) labeled Inputs, Parameters selection, and Forecasts, with boxes for Twitter data, Google data, CDC, Multivariate fit, Generative models, Stochastic simulations, and Forecasting.]

Extracting features of geographical locations, languages, and keywords from Twitter and Google data, and the ILI trend from CDC data.

Calibrating generative models with a multivariate fit.

Analyzing the forecasting results against CDC data from past seasons.
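
The idea of calibrating a generative model against an observed signal can be illustrated with the toy sketch below. It is not the pipeline from the slides: it swaps in a simple chain-binomial SIR model and a plain least-squares grid search in place of the multivariate fit, and the "observed" curve is synthetic. All function names, parameter values, and the candidate grid are assumptions made for illustration.

```python
import random

def simulate_sir(beta, gamma=0.5, population=2000, seed_cases=10,
                 weeks=15, rng=None):
    """One stochastic (chain-binomial) SIR run; returns weekly new infections."""
    rng = rng or random.Random()
    s, i = population - seed_cases, seed_cases
    incidence = []
    for _ in range(weeks):
        # Probability that a given susceptible is infected this week.
        p_inf = 1.0 - (1.0 - beta / population) ** i
        new_inf = sum(rng.random() < p_inf for _ in range(s))
        new_rec = sum(rng.random() < gamma for _ in range(i))
        s -= new_inf
        i += new_inf - new_rec
        incidence.append(new_inf)
    return incidence

def mean_curve(beta, runs, rng):
    """Average weekly incidence over several stochastic runs."""
    sims = [simulate_sir(beta, rng=rng) for _ in range(runs)]
    return [sum(week) / runs for week in zip(*sims)]

def select_beta(observed, candidates, runs=5, seed=0):
    """Pick the candidate whose mean simulated curve is closest
    (sum of squared errors) to the observed curve."""
    rng = random.Random(seed)

    def sse(beta):
        curve = mean_curve(beta, runs, rng)
        return sum((o - c) ** 2 for o, c in zip(observed, curve))

    return min(candidates, key=sse)

# Synthetic stand-in for a surveillance signal (not real ILI data).
observed = simulate_sir(1.0, rng=random.Random(42))
candidates = [0.6, 0.8, 1.0, 1.2, 1.4]
best = select_beta(observed, candidates)
forecast = mean_curve(best, runs=5, rng=random.Random(1))
print(best, forecast[:5])
```

The selected parameter is then reused to run fresh simulations, which is the sense in which a fit and a generative model are combined here.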

AND MORE…

Predicting the results of popular votes (American Idol). F. Ciulla et al., EPJ Data Science, 1, 8, 2012

Understanding human communication patterns (work in progress)

Understanding the spreading of hashtags (work in progress)

Mapping cultural differences (work in progress)

CONCLUSIONS

A lot has been done…

WHAT SHALL WE DO NEXT??

THANK YOU

www.nicolaperra.com