Why Watching Movie Tweets Won’t Tell the Whole Story? Felix Ming-Fai Wong, Soumya Sen, Mung Chiang...

24
Why Watching Movie Tweets Won’t Tell the Whole Story? Felix Ming-Fai Wong, Soumya Sen, Mung Chiang EE, Princeton University 1 WOSN’12. August 17, Helsinki.

Transcript of Why Watching Movie Tweets Won’t Tell the Whole Story? Felix Ming-Fai Wong, Soumya Sen, Mung Chiang...

Page 1: Why Watching Movie Tweets Won’t Tell the Whole Story? Felix Ming-Fai Wong, Soumya Sen, Mung Chiang EE, Princeton University 1 WOSN’12. August 17, Helsinki.

Why Watching Movie Tweets Won’t Tell the Whole Story?

Felix Ming-Fai Wong, Soumya Sen, Mung Chiang

EE, Princeton University

1

WOSN’12. August 17, Helsinki.

Page 2: Why Watching Movie Tweets Won’t Tell the Whole Story? Felix Ming-Fai Wong, Soumya Sen, Mung Chiang EE, Princeton University 1 WOSN’12. August 17, Helsinki.

Twitter: A Gold Mine?

2

Apps: Mombo, TwitCritics, Fflick

Page 3: Why Watching Movie Tweets Won’t Tell the Whole Story? Felix Ming-Fai Wong, Soumya Sen, Mung Chiang EE, Princeton University 1 WOSN’12. August 17, Helsinki.

Or Too Good to be True?

3

Representativeness???

Page 4: Why Watching Movie Tweets Won’t Tell the Whole Story? Felix Ming-Fai Wong, Soumya Sen, Mung Chiang EE, Princeton University 1 WOSN’12. August 17, Helsinki.

Why representativeness?

Selection bias Gender Age Education Computer skills Ethnicity Interests

The silent majority

4

Page 5: Why Watching Movie Tweets Won’t Tell the Whole Story? Felix Ming-Fai Wong, Soumya Sen, Mung Chiang EE, Princeton University 1 WOSN’12. August 17, Helsinki.

Why Movies?

Large online interest

Relatively unambiguous

Right in Timing: Oscars

5

Page 6: Why Watching Movie Tweets Won’t Tell the Whole Story? Felix Ming-Fai Wong, Soumya Sen, Mung Chiang EE, Princeton University 1 WOSN’12. August 17, Helsinki.

Key Questions

Is there a bias in Twitter movie ratings?

How do Twitters compare to other online site?

Is there a quantifiable difference across types of movies? Oscar nominated vs. non-nominated commercial movies

Can hype ensure Box-office gains?

6

Page 7: Why Watching Movie Tweets Won’t Tell the Whole Story? Felix Ming-Fai Wong, Soumya Sen, Mung Chiang EE, Princeton University 1 WOSN’12. August 17, Helsinki.

Data Stats

12 Million Tweets 1.77M valid movie tweets

February 2 – March 12, 2012

34 movie titles New releases (20) Oscar-nominees (14)

7

Page 8: Why Watching Movie Tweets Won’t Tell the Whole Story? Felix Ming-Fai Wong, Soumya Sen, Mung Chiang EE, Princeton University 1 WOSN’12. August 17, Helsinki.

Example Movies

Commercial Movies (non-nominees) Oscar-nominees

The Grey The Descendants

Underworld: Awakening The Artist

Contraband The Help

Haywire Hugo

The Woman in Black Midnight in Paris

Chronicle Moneyball

The Vow The Tree of Life

Journey 2: The Mysterious Island War Horse

8

Oscar-nominees for Best Picture or Best Animated Film

Page 9: Why Watching Movie Tweets Won’t Tell the Whole Story? Felix Ming-Fai Wong, Soumya Sen, Mung Chiang EE, Princeton University 1 WOSN’12. August 17, Helsinki.

Rating Comparison

Twitter movie ratings Binary Classification: Positivity/Negativity

IMDb, Rotten Tomatoes ratings

Rating scale: 0 – 10, 0 – 5

Benchmark definition Rotten Tomatoes: Positive - 3.5/5 IMDb: Positive: 7/10 Movie score: weighted average High mutual information

9

Page 10: Why Watching Movie Tweets Won’t Tell the Whole Story? Felix Ming-Fai Wong, Soumya Sen, Mung Chiang EE, Princeton University 1 WOSN’12. August 17, Helsinki.

Challenges

Common Words in titles e.g., “Thanks for the help”, “the grey sky”

No API support for exact matching e.g., “a grey cat in the box”

Misrepresentations e.g., “underworld awakening”

Non-standard vocabulary e.g., “sick movie”

Non-English tweets

Retweets: approves original opinion

10

Page 11: Why Watching Movie Tweets Won’t Tell the Whole Story? Felix Ming-Fai Wong, Soumya Sen, Mung Chiang EE, Princeton University 1 WOSN’12. August 17, Helsinki.

Tweet Classification

11

Page 12: Why Watching Movie Tweets Won’t Tell the Whole Story? Felix Ming-Fai Wong, Soumya Sen, Mung Chiang EE, Princeton University 1 WOSN’12. August 17, Helsinki.

Steps

Preprocessing Remove usernames Conversion:

Punctuation marks (e.g., “!” to a meta-word “exclmark”) Emoticons (e,g,, or :), etc. ), URLs, “@”

Feature Vector Conversion Preprocessed tweet to binary feature vector (using MALLET)

Training & Classification 11K non-repeated, randomly sampled tweets manually labeled Trained three classifiers

12

Preprocessing Training Data Classification Analysis

Page 13: Why Watching Movie Tweets Won’t Tell the Whole Story? Felix Ming-Fai Wong, Soumya Sen, Mung Chiang EE, Princeton University 1 WOSN’12. August 17, Helsinki.

Classifier Comparison

SVM based classifier performance is significantly better

Numbers in brackets are balanced accuracy rates

13

Page 14: Why Watching Movie Tweets Won’t Tell the Whole Story? Felix Ming-Fai Wong, Soumya Sen, Mung Chiang EE, Princeton University 1 WOSN’12. August 17, Helsinki.

Temporal Context

14

Woman in Black Chronicle The Vow Safe House

Most “current” tweets are “mentions” LBS

“Before” or “After” tweets are much more positive

Fraction of tweets in joint-classes

Page 15: Why Watching Movie Tweets Won’t Tell the Whole Story? Felix Ming-Fai Wong, Soumya Sen, Mung Chiang EE, Princeton University 1 WOSN’12. August 17, Helsinki.

Sentiment: Positive Bias

15

Woman in Black Chronicle The Vow Safe House Haywire Star Wars I

Twitters are overwhelmingly more positive for almost all movies

Page 16: Why Watching Movie Tweets Won’t Tell the Whole Story? Felix Ming-Fai Wong, Soumya Sen, Mung Chiang EE, Princeton University 1 WOSN’12. August 17, Helsinki.

Rotten Tomatoes vs. IMDb?

Good match between RT and IMDb (Benchmark)

16

Page 17: Why Watching Movie Tweets Won’t Tell the Whole Story? Felix Ming-Fai Wong, Soumya Sen, Mung Chiang EE, Princeton University 1 WOSN’12. August 17, Helsinki.

Twitters vs. IMDb vs. RT ratings (1)

New (non-nominated) movies score more positively from Twitter users

17

IMDb vs. Twitter RT vs. Twitter

Page 18: Why Watching Movie Tweets Won’t Tell the Whole Story? Felix Ming-Fai Wong, Soumya Sen, Mung Chiang EE, Princeton University 1 WOSN’12. August 17, Helsinki.

Twitters vs. IMDb vs. RT ratings (2)

Oscar-nominees rate more positively on IMDb and RT

18

IMDb vs. Twitter RT vs. Twitter

Page 19: Why Watching Movie Tweets Won’t Tell the Whole Story? Felix Ming-Fai Wong, Soumya Sen, Mung Chiang EE, Princeton University 1 WOSN’12. August 17, Helsinki.

Quantification Metrics (1)

(x*, y*) are the medians of proportion of positive ratings in Twitter and IMDb/RT

19

B

P

Positiveness (P):

Bias (B):IM

Db

/RT s

core

Twitter score

x*

y*

1

1

Page 20: Why Watching Movie Tweets Won’t Tell the Whole Story? Felix Ming-Fai Wong, Soumya Sen, Mung Chiang EE, Princeton University 1 WOSN’12. August 17, Helsinki.

Quantification Metrics (2)

20

Inferrability (I):

If one knows the average rating on Twitter, how well can he/she predict the ratings on IMDb/RT?

Page 21: Why Watching Movie Tweets Won’t Tell the Whole Story? Felix Ming-Fai Wong, Soumya Sen, Mung Chiang EE, Princeton University 1 WOSN’12. August 17, Helsinki.

Summary Metrics

21

Oscar-nominees have higher positiveness

RT-IMDb have high mutual inferrability and low bias

Twitter users are more positively biased for non-nominated new releases

Page 22: Why Watching Movie Tweets Won’t Tell the Whole Story? Felix Ming-Fai Wong, Soumya Sen, Mung Chiang EE, Princeton University 1 WOSN’12. August 17, Helsinki.

Hype-approval vs. Ratings

22

Higher (lower) Hype-approval ≠ Higher (lower) IMDb/RT ratings

The VowThe Vow

This Means WarThis Means War

Page 23: Why Watching Movie Tweets Won’t Tell the Whole Story? Felix Ming-Fai Wong, Soumya Sen, Mung Chiang EE, Princeton University 1 WOSN’12. August 17, Helsinki.

Hype-approval vs. Box-office

High Hype + High IMDb rating: Box-office success

Other combination: Unpredictable23

Journey 2

Page 24: Why Watching Movie Tweets Won’t Tell the Whole Story? Felix Ming-Fai Wong, Soumya Sen, Mung Chiang EE, Princeton University 1 WOSN’12. August 17, Helsinki.

Summary

Tweeters aren’t representative enough

Need to be careful about tweets as online polls

Tweeters’ tend to appear more positive, have specific interests and taste

Need for quantitative metrics

Thanks!

24