Talk of the City: Londoners and Social Media

Post on 21-Jun-2015

description

talk of the city http://tinyurl.com/cctxbzo tracking emotions in the city http://tinyurl.com/7uvjasy

Transcript of Talk of the City: Londoners and Social Media

Londoners and Social Media: Track Community “Happiness” + Target Ads

@danielequercia

<who am i>

daniele quercia

offline & online

<goal>

social media language personality

social media

<why>

social media

Pop press pundits (Archbishop, England & Wales): “Social-networking sites ‘dehumanize’ community life”

social media 1Q&A

social media 2Q&A

social media 3Q&A

CS researchers: “Twitter is NOT a social network but a news media”

“I beg to differ” ;-)

social media language personality

social media

community deprivation well-being use of words

?

Goal

1 collect profiles & geo-reference them

2 classify sentiment of profiles

3 match sentiment with (census) deprivation

community deprivation well-being use of words

250K profiles in London (31.5M tweets)

3 seeds: newspaper accounts

1 collect profiles & geo-reference them

1,323 in London neighborhoods; 573 in 51 neighborhoods

Word Count vs. Maximum Entropy

2 classify sentiment of profiles

Word Count
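A minimal sketch of the word-count idea: count positive and negative words across a profile's tweets and compare the two fractions. The tiny word lists, the scoring formula, and the name `word_count_sentiment` are illustrative assumptions; the slides do not say which lexicon or formula was used.

```python
import re

# Hypothetical toy lexicons; a real study would rely on a full sentiment dictionary.
POSITIVE = {"happy", "love", "great", "good", "nice"}
NEGATIVE = {"sad", "hate", "awful", "bad", "angry"}

def word_count_sentiment(tweets):
    """Return (positive fraction - negative fraction) over all words of a profile."""
    words = [w for t in tweets for w in re.findall(r"[a-z']+", t.lower())]
    if not words:
        return 0.0
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / len(words)

# Made-up profile: 8 words, 2 positive ("love", "great"), 1 negative ("bad").
profile = ["I love this great city", "bad traffic today"]
score = word_count_sentiment(profile)
```

A score above zero means the profile uses more positive than negative words; no training data is needed, only the dictionary.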

social media language personality

Max Entropy

Training? On 300K tweets containing smiley and frowny faces
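The training trick above can be sketched as follows: emoticons serve as free labels, and a two-class Maximum Entropy model (equivalent to logistic regression) is fit on the remaining words. Everything here is an assumption made for illustration: the emoticon lists, the bag-of-words features, the learning rate, and the toy tweets are not taken from the slides.

```python
import math
import re

SMILEYS = (":)", ":-)")
FROWNS = (":(", ":-(")

def label_from_emoticon(tweet):
    """Distant supervision: 1 for a smiley, 0 for a frowny, None if neither or both."""
    has_pos = any(e in tweet for e in SMILEYS)
    has_neg = any(e in tweet for e in FROWNS)
    if has_pos == has_neg:
        return None
    return 1 if has_pos else 0

def tokens(tweet):
    """Lowercase word tokens; the regex drops the emoticons used as labels."""
    return re.findall(r"[a-z']+", tweet.lower())

def train_maxent(tweets, epochs=200, lr=0.5):
    """Fit two-class MaxEnt (logistic regression) by stochastic gradient ascent."""
    data = [(set(tokens(t)), label_from_emoticon(t)) for t in tweets]
    data = [(x, y) for x, y in data if y is not None]  # keep labeled tweets only
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for feats, y in data:
            z = bias + sum(weights.get(f, 0.0) for f in feats)
            p = 1.0 / (1.0 + math.exp(-z))
            step = lr * (y - p)  # gradient of the log-likelihood for this example
            bias += step
            for f in feats:
                weights[f] = weights.get(f, 0.0) + step
    return weights, bias

def predict_positive(model, tweet):
    """Probability that a tweet is positive under the fitted model."""
    weights, bias = model
    z = bias + sum(weights.get(f, 0.0) for f in set(tokens(tweet)))
    return 1.0 / (1.0 + math.exp(-z))

# Made-up training tweets; the last one has no emoticon and is ignored.
train = [
    "love the sunshine in the park :)",
    "great gig tonight :-)",
    "love this great market :)",
    "hate the delays on the tube :(",
    "awful rain and delays again :-(",
    "hate this awful queue :(",
    "no emoticon here",
]
model = train_maxent(train)
```

Unlike the word-count approach, the word weights are learned from the data, so no hand-built dictionary is required.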

Word Count vs. Max Entropy

Index of Multiple Deprivation

3 match sentiment with (census) deprivation

r = .350 (word count); r = .365 (MaxEnt)
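The r values are correlation coefficients between a neighborhood's sentiment score and its deprivation index. A minimal sketch of Pearson's r follows, assuming that is the coefficient reported; the function name and the numbers in the example are made up and are not the study's data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Made-up per-neighborhood values, for illustration only.
sentiment = [0.10, 0.05, 0.20, 0.15]
imd_score = [20.0, 30.0, 15.0, 25.0]
r = pearson_r(sentiment, imd_score)
```

Values of r near 0 indicate no linear relationship, and values near 1 or -1 indicate a strong one.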

predicting socioeconomic well-being with twitter

[CSCW’12] Tracking Gross Community Happiness from Tweets

Going beyond sentiment … Look at the subject matter of tweets!

Extract topics from tweets. Easiest way?

Matching Keywords
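Keyword matching can be sketched in a few lines: a tweet belongs to a topic if it contains any of that topic's keywords. The dictionary `TOPIC_KEYWORDS` and the function `match_topics` are hypothetical and exist only for this example.

```python
import re

# Hypothetical hand-built dictionary: topic -> keywords.
TOPIC_KEYWORDS = {
    "sports": {"football", "match", "goal"},
    "health": {"gym", "diet", "doctor"},
}

def match_topics(tweet):
    """Return the sorted list of topics with at least one keyword in the tweet."""
    words = set(re.findall(r"[a-z]+", tweet.lower()))
    return sorted(t for t, kws in TOPIC_KEYWORDS.items() if words & kws)

topics = match_topics("Great goal in the match, then off to the gym")
```

The approach is simple, but someone has to write the dictionary by hand for every topic of interest.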

Dictionary of keywords? A machine learning model? Training?

Use an unsupervised machine learning model (no labeled training data required)

Latent Dirichlet Allocation (LDA)

read profiles & define topics

create virtual bins (latent topics)
assign words to a bin (@ random)
for each bin:
    select pair of words
    if co-occur more than chance:
        keep them in the bin
    else:
        put them into another bin (@ random)
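The bin-shuffling intuition above corresponds loosely to how LDA is commonly fit in practice. Below is a minimal sketch using collapsed Gibbs sampling, which is one standard inference method and not necessarily the one used for this talk. The function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the toy documents are assumptions.

```python
import random

def lda_gibbs(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; docs is a list of token lists."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    # Count tables: per-document topic counts, per-topic word counts, topic totals.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [dict() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    # Start by assigning every word occurrence to a random bin (topic).
    assign = []
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] = topic_word[k].get(w, 0) + 1
            topic_total[k] += 1
        assign.append(zs)
    # Repeatedly take each word out of its bin and re-draw the bin, favoring
    # topics that are frequent in this document and that already hold this word.
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t].get(w, 0) + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] = topic_word[k].get(w, 0) + 1
                topic_total[k] += 1
    return doc_topic, topic_word

# Made-up "profiles", each already split into words.
docs = [
    ["football", "goal", "match"],
    ["goal", "match", "football"],
    ["diet", "gym", "doctor"],
    ["gym", "doctor", "diet"],
]
doc_topic, topic_word = lda_gibbs(docs, num_topics=2)
```

`doc_topic[d]` holds how many words of profile `d` ended up in each topic, and `topic_word[k]` holds the word counts of topic `k`; words that tend to co-occur in the same profiles gravitate to the same topic.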

Facebook Twitter

social

econometrics

Latent Dirichlet Allocation (LDA)

social media environment sports health wedding parties

Spanish/Portuguese celebrity gossip

Support Vector Regression: IMD <- SVR(topics); accuracy: error of 8.14 for IMD in [13.12, 46.88]
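To put the 8.14 figure in context, it helps to compare it with the width of the IMD range. The sketch below assumes that 8.14 is an average absolute error in IMD points; the slide does not name the metric, and `mean_absolute_error` is a helper written for this example.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Numbers from the slide.
reported_error = 8.14
imd_min, imd_max = 13.12, 46.88

# Error as a fraction of the IMD range: 8.14 / 33.76, roughly 0.24.
relative_error = reported_error / (imd_max - imd_min)
```

Under that assumption, the typical prediction is off by about a quarter of the range of IMD values observed.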

Some areas have very few profiles! residents + visitors

Analyze geo-referenced tweets (not only residents but also visitors)

Linear Regression: R² = .49 (49% of IMD variability explained)
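R² is the share of the variance in the target that the fitted line accounts for. A minimal sketch with a single predictor follows; the regression in the talk presumably used several topic features, so this only illustrates the quantity itself. The function names and the numbers are made up.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def r_squared(xs, ys, slope, intercept):
    """1 - (residual sum of squares / total sum of squares)."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Made-up data: one topic's prevalence vs. a deprivation score.
xs = [1.0, 2.0, 3.0]
ys = [1.0, 3.0, 2.0]
slope, intercept = fit_line(xs, ys)
r2 = r_squared(xs, ys, slope, intercept)
```

An R² of .49 therefore means that the fitted model accounts for about half of the variation in IMD across neighborhoods.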

So what?

Theoretical Implications

Practical Implications

Ads and the City: Considering Geographic Distance Goes a Long Way

Problem Statement: Given a venue (new bar/restaurant), suggest guests

Web ≠ people move!

On people mobility (from the literature):

1) likes might matter
2) distance matters
3) “power users” are special
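Point 2 suggests a simple baseline for the problem statement: rank candidate guests by how close they are to the venue. The sketch below is such a distance-only baseline, not one of the models from the talk; the function names, the user identifiers, and the coordinates are made up for illustration.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    radius_km = 6371.0  # mean Earth radius
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

def rank_guests(venue, users):
    """Return user ids sorted by distance to the venue, nearest first.

    venue is a (lat, lon) pair; users maps a user id to a (lat, lon) pair.
    """
    return sorted(users, key=lambda u: haversine_km(*venue, *users[u]))

# Hypothetical venue in central London and three hypothetical users.
venue = (51.5074, -0.1278)
users = {"a": (51.51, -0.13), "b": (51.60, -0.30), "c": (51.45, 0.00)}
ranking = rank_guests(venue, users)
```

Points 1 and 3 would add further signals on top of this ordering, such as what a user likes and how far that user usually travels.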

The extent to which one is a power user ;)

HIGH α: travel farther

1) Naïve Bayesian
2) Bayesian
3) Linear Regression (learn weights)

(2)

Future (well, current & you could help)

1 complex buildings

“Who talks to whom”

Network

2 tools for topical & sentiment analysis

3 urbanopticon.org

@danielequercia