Social text sentiment and tone analysis [aai 201] - (4160)

April 10-12 | Chicago, IL Social Text, Sentiment and Tone Analysis Advanced Analytics and Insights Ruben Pertusa Lopez, SolidQ, Business Intelligence DPE & MAP 2013 ([email protected]) Paco Gonzalez, SolidQ, Mentor ([email protected])

Transcript of Social text sentiment and tone analysis [aai 201] - (4160)

Page 1: Social text sentiment and tone analysis [aai 201] - (4160)

April 10-12 | Chicago, IL

Social Text, Sentiment and Tone Analysis

Advanced Analytics and Insights

Ruben Pertusa Lopez, SolidQ, Business Intelligence DPE & MAP 2013 ([email protected])

Paco Gonzalez, SolidQ, Mentor ([email protected])

Page 3: Social text sentiment and tone analysis [aai 201] - (4160)

Paco & Ruben

• Conference Speakers

• Based in Europe

• Book Authors

• PHD Candidate on Data Mining

• Project Managers

• Microsoft Certified Professionals

Page 4: Social text sentiment and tone analysis [aai 201] - (4160)

Goals

• This session will help us understand how to analyze sentiment / tone using cutting-edge Microsoft technologies

• This is NOT a research session on NLP (Natural Language Processing). Don’t be scared! ☺

Page 5: Social text sentiment and tone analysis [aai 201] - (4160)

Agenda

• Overview of Sentiment Analysis

• Gathering and Storing Data

• Sentiment Analysis Techniques

• Business Analytics for Sentiment & Structured Data, together

Page 6: Social text sentiment and tone analysis [aai 201] - (4160)

Overview of Sentiment Analysis

Page 7: Social text sentiment and tone analysis [aai 201] - (4160)

What is Sentiment?

Feelings, Opinions, Emotions

• Like

• Dislike

• Good

• Bad

• …

Page 8: Social text sentiment and tone analysis [aai 201] - (4160)

Some examples

paulo @paulomors

Barcelona Beating Milan 4-0 and a gallon of Coke, best day of the year ☺

αυѕтιη ✰✰✰✰ @AustinJohnsto

A 2 hour delay is like when a restaurant only has Pepsi products. It'll work but ill still be very disappointed.

Trace @trace_haf

Drinking more Coke than water every day does no good for you health.

DONALD @donald150

My budget decides whether coke or pepsi, always buy the cheaper

Page 9: Social text sentiment and tone analysis [aai 201] - (4160)

Not only Twitter!

The Walking Dead Season 3 (3300 customer reviews)

Awesome season opener! October 15, 2012 By K. Erwin

After sticking through season two, which was alot of looking for a little girl and standing around on a farm waiting for something to happen, I hoped that they'd pick up the pace a little with season three. The premiere doesn't dissapoint!

The Walking Dead Season 3 Mid-Premiere (72K Likes, 7K comments)

Timothy Berteau You guys are crazy. One boring episode and you think the show sucks, did you forget how awesome the two episodes before it were? Not every episode is going to be action packed. Season 1 and 2 had some very boring moments as well.

18 February at 00:58 via mobile · 5 likes

Page 10: Social text sentiment and tone analysis [aai 201] - (4160)

DEMO: Looking at some Twitter Sentiment

Page 11: Social text sentiment and tone analysis [aai 201] - (4160)

Surrounded by opinions

Millions of opinions

• 12 TB / day

• 21 PB Hadoop cluster

• 7 PB / month (search queries info)

• 1 TB of tweets / day

• 75 M scores / day

• 4 B graph edges / day

• 7 TB of data / day

Page 12: Social text sentiment and tone analysis [aai 201] - (4160)

Valuable Information for our Business

Questions

• Is this review positive or negative?

• Does this Twitter user like or dislike my new show?

• What are they saying about our company or services?

• What are Facebook users’ attitudes toward the next election?

Is that the only valuable info?

Page 13: Social text sentiment and tone analysis [aai 201] - (4160)

What is Sentiment Analysis?

Text Categorization (Opinion Mining)

• Positive / Negative / Neutral

• 1, 2, 3, 4, 5 Stars

• For / Against

Text → Sentiment Analysis Techniques → Category

IN: Text

OUT: Photos, Videos, etc.

Page 14: Social text sentiment and tone analysis [aai 201] - (4160)

DEMO: Twitter for Analytics

Page 15: Social text sentiment and tone analysis [aai 201] - (4160)

Gathering and Storing Data

Page 16: Social text sentiment and tone analysis [aai 201] - (4160)

Gathering data from sources

Public/Private APIs

• Limitations

• Privacy

• Format & Structure of source data

• Updates

Web crawlers? Call the cops!

Page 17: Social text sentiment and tone analysis [aai 201] - (4160)

What might it look like?

JSON Example (1 tweet)
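
The JSON shown on the original slide did not survive extraction. As a rough stand-in, the sketch below parses a trimmed-down tweet and flattens it into a row. The field names (`text`, `lang`, `user.screen_name`) follow the classic Twitter REST API tweet object; the payload itself is an assumption that reuses a tweet quoted earlier, the `lang` value is assumed, and `flatten_tweet` is a hypothetical helper, not something from the session.

```python
import json

# Hypothetical, trimmed-down payload: real tweet objects carry many more
# fields. The text and screen name come from an earlier slide; "lang" is assumed.
raw = '''
{
  "text": "Barcelona Beating Milan 4-0 and a gallon of Coke, best day of the year ☺",
  "lang": "en",
  "user": {"screen_name": "paulomors", "name": "paulo"}
}
'''

def flatten_tweet(payload: str) -> dict:
    """Turn one JSON tweet into a flat row suitable for a relational table."""
    tweet = json.loads(payload)
    return {
        "screen_name": tweet["user"]["screen_name"],  # nested object -> column
        "lang": tweet.get("lang"),                    # optional field
        "text": tweet["text"],
    }

row = flatten_tweet(raw)
print(row)
```

Flattening the nested `user` object is the step that matters here: it is what lets semi-structured JSON land in a structured store.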

Page 18: Social text sentiment and tone analysis [aai 201] - (4160)

Storing the gathered data

Why?

• Valuable data!

• Store now. Figure out later

How?

• Structured (Relational Database – SQL Server)

• Semi-Structured (Hadoop Cluster – Microsoft HDInsight)

Page 19: Social text sentiment and tone analysis [aai 201] - (4160)

Doctor, we need some help

The 4 V’s

Volume

Velocity

Variety

Variability

Page 21: Social text sentiment and tone analysis [aai 201] - (4160)

Start building our system

Extract → Transform → Load

Extract → Load
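
The two flows on this slide (Extract, Transform, Load for the structured store and Extract, Load for the "store now, figure out later" store) can be sketched roughly as follows. This is only an illustration under stated assumptions: Python's built-in `sqlite3` stands in for both SQL Server and the HDInsight side, and every function and table name is hypothetical.

```python
import json
import sqlite3

def extract(lines):
    """Extract: parse one JSON document per line, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]

def transform(tweets):
    """Transform: keep only the columns the relational table needs."""
    return [(t["user"]["screen_name"], t["text"].strip()) for t in tweets]

def load(conn, rows):
    """Load: insert the structured rows into a relational table."""
    conn.execute("CREATE TABLE IF NOT EXISTS tweets (screen_name TEXT, text TEXT)")
    conn.executemany("INSERT INTO tweets VALUES (?, ?)", rows)
    conn.commit()

def load_raw(conn, lines):
    """Extract-Load: store the untouched JSON now and interpret it later."""
    conn.execute("CREATE TABLE IF NOT EXISTS raw_tweets (payload TEXT)")
    conn.executemany(
        "INSERT INTO raw_tweets VALUES (?)",
        [(line,) for line in lines if line.strip()],
    )
    conn.commit()

# One hypothetical input line, reusing a tweet quoted earlier in the deck.
sample = [
    '{"text": "My budget decides whether coke or pepsi, always buy the cheaper",'
    ' "user": {"screen_name": "donald150"}}',
]

conn = sqlite3.connect(":memory:")      # in-memory stand-in database
load(conn, transform(extract(sample)))  # ETL path
load_raw(conn, sample)                  # EL path

stored = conn.execute("SELECT screen_name, text FROM tweets").fetchall()
raw_count = conn.execute("SELECT COUNT(*) FROM raw_tweets").fetchone()[0]
print(stored, raw_count)
```

Keeping the raw payload alongside the transformed rows means a later change in analysis technique does not require gathering the data again.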

Page 22: Social text sentiment and tone analysis [aai 201] - (4160)

DEMO: Let’s gather some Twitter data ☺

Page 23: Social text sentiment and tone analysis [aai 201] - (4160)

Sentiment Analysis Techniques

Page 24: Social text sentiment and tone analysis [aai 201] - (4160)

Understanding the problem

Identify relevant parts

• Nouns

• Adjectives/Adverbs

• Verbs

Drop anything else

Data != Sentiment Data
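
A minimal sketch of the "identify relevant parts, drop anything else" idea. The part-of-speech lookup here is a tiny hand-written table invented for this example; a real pipeline would get its tags from an NLP part-of-speech tagger. `TOY_POS` and `relevant_words` are hypothetical names.

```python
# Toy part-of-speech lookup: an assumption for illustration only.
TOY_POS = {
    "coke": "NOUN", "drink": "NOUN", "market": "NOUN",
    "best": "ADJ", "coolest": "ADJ",
    "is": "VERB",
    "the": "DET", "on": "PREP", "&": "CONJ",
}

# The word classes the slide keeps: nouns, adjectives/adverbs, verbs.
RELEVANT = {"NOUN", "ADJ", "ADV", "VERB"}

def relevant_words(text: str) -> list[str]:
    """Keep only words tagged with a relevant class; unknown words are dropped."""
    words = text.lower().split()
    return [w for w in words if TOY_POS.get(w) in RELEVANT]

kept = relevant_words("Coke is the best & coolest drink on the market")
print(kept)
```

Dropping unknown words is a deliberate simplification; with a real tagger every token would receive a tag and nothing would be silently lost.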

Page 25: Social text sentiment and tone analysis [aai 201] - (4160)

Sentiment Analysis Techniques

Techniques

• Natural Language Processing

• Basic Statistics

• Clustering

• Fuzzy Components

• Classification

• Estimation

Tools

HDInsight, SSIS, SSAS DM, Full-Text Search (Semantic Search)

Page 26: Social text sentiment and tone analysis [aai 201] - (4160)

Dictionaries

Match words

“Coke is the best & coolest drink on the market”

Some Other Database Example

SentiWordNet: http://sentiwordnet.isti.cnr.it/

Tone      Index  Dictionary
Positive   10    amazing
Positive    9    awesome
Positive    8    best
Positive    7    excellent
Positive    6    exciting
Positive    5    great
Positive    4    good
Positive    3    rocks
Positive    2    cool
Positive    1    :)
Neutral     0
Negative   -1    :(
Negative   -2    poor
Negative   -3    bad
Negative   -4    criticized
Negative   -5    attacked
Negative   -6    humiliated
Negative   -7    sucks
Negative   -8    terrible
Negative   -9    horrible
Negative  -10    worthless

Page 27: Social text sentiment and tone analysis [aai 201] - (4160)

A closer look at the example

Tone      Index  Dictionary
Positive   10    amazing
Positive    9    awesome
Positive    8    best
Positive    7    excellent
Positive    6    exciting
Positive    5    great
Positive    4    good
Positive    3    rocks
Positive    2    cool
Positive    1    :)
Neutral     0
Negative   -1    :(
Negative   -2    poor

“Coke is the best & coolest drink on the market”

Fuzzy Match

“Best” matches “best” = +8

“coolest” matches “cool” = +2

Total = +10 ☺
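
The dictionary lookup and fuzzy match walked through on this slide can be sketched as below. The tone index is copied from the slide; everything else is an assumption made for illustration. In particular, `difflib.get_close_matches` from the Python standard library stands in for whatever fuzzy-matching component the demo actually used, the 0.7 cutoff is an arbitrary choice, and the Positive / Negative / Neutral thresholds in `label` are a guess at how a score maps to a category.

```python
from difflib import get_close_matches

# Tone index dictionary, copied from the slide.
TONE_INDEX = {
    "amazing": 10, "awesome": 9, "best": 8, "excellent": 7, "exciting": 6,
    "great": 5, "good": 4, "rocks": 3, "cool": 2, ":)": 1,
    ":(": -1, "poor": -2, "bad": -3, "criticized": -4, "attacked": -5,
    "humiliated": -6, "sucks": -7, "terrible": -8, "horrible": -9,
    "worthless": -10,
}

def score_word(word: str, cutoff: float = 0.7) -> int:
    """Exact dictionary hit first, then the closest fuzzy hit, else 0."""
    if word in TONE_INDEX:
        return TONE_INDEX[word]
    close = get_close_matches(word, list(TONE_INDEX), n=1, cutoff=cutoff)
    return TONE_INDEX[close[0]] if close else 0

def score_text(text: str) -> int:
    """Sum the tone index of every word in the text."""
    tokens = [t.strip('.,!?"') for t in text.lower().split()]
    return sum(score_word(t) for t in tokens if t)

def label(score: int) -> str:
    """Map a numeric score to a category (thresholds are an assumption)."""
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"

total = score_text("Coke is the best & coolest drink on the market")
print(total, label(total))  # 10 Positive
```

A plain word-sum like this has no notion of negation, context, or sarcasm, so it is best read as a baseline rather than a finished classifier.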

Page 28: Social text sentiment and tone analysis [aai 201] - (4160)

DEMO: Give me your sentiment!

Page 29: Social text sentiment and tone analysis [aai 201] - (4160)

Sentiment Analysis Challenges

Phrase level polarity

• “The US fears happy caffeine consumers”.

Context polarity

• Context: The date that Steve Jobs died.

• Opinion 1: “This is the worst day of my life”.

• Opinion 2: “The world is a better place today”.

Irony, Sarcasm, Irregular forms, Pragmatic information…

Page 30: Social text sentiment and tone analysis [aai 201] - (4160)

It looks like…

Extract → Transform → Load

Extract → Load

SA Techniques

Page 31: Social text sentiment and tone analysis [aai 201] - (4160)

Combining Sentiment Analysis with Structured Data

Page 32: Social text sentiment and tone analysis [aai 201] - (4160)

MORE Valuable Information for our Business

Questions

• Is this positive/negative review affecting my sales?

• Have we increased our sales because of positive tweets?

• Can we learn anything from opinions to improve our products?

• How are people responding to this campaign?
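
Answering questions like these means putting sentiment next to structured figures. A minimal sketch, assuming tweets have already been scored and dated: average the sentiment per day and line it up with sales for the same day. All numbers below are made up for illustration and do not come from the session; the function names are hypothetical.

```python
from collections import defaultdict

# Made-up numbers for illustration only; not taken from the session.
daily_sales = {"2013-04-10": 120, "2013-04-11": 150}
scored_tweets = [
    ("2013-04-10", 8),
    ("2013-04-10", -2),
    ("2013-04-11", 10),
]

def average_sentiment_by_day(tweets):
    """Group tweet scores by day and average them."""
    by_day = defaultdict(list)
    for day, score in tweets:
        by_day[day].append(score)
    return {day: sum(scores) / len(scores) for day, scores in by_day.items()}

def join_with_sales(sales, sentiment):
    """Pair each day's sales with its average sentiment (None if no tweets)."""
    return [(day, sales[day], sentiment.get(day)) for day in sorted(sales)]

report = join_with_sales(daily_sales, average_sentiment_by_day(scored_tweets))
for day, units, sentiment in report:
    print(day, units, sentiment)
```

A table shaped like `report` is the kind of input a cube or a correlation analysis would need; whether sentiment actually drives sales is a separate question this sketch does not answer.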

Page 33: Social text sentiment and tone analysis [aai 201] - (4160)

Bring our cubes into a new world

New ways of analysis

Better decisions

• Drive next campaign, new wave of products, etc.

Estimate future figures

Page 34: Social text sentiment and tone analysis [aai 201] - (4160)

DEMO: Our cubes have feelings ☺

Page 35: Social text sentiment and tone analysis [aai 201] - (4160)

Final picture

Extract → Transform → Load

Extract → Load

SA Techniques

Page 36: Social text sentiment and tone analysis [aai 201] - (4160)

Brief Summary

During this session...

• Business Insights from social opinions

• Sentiment Analysis loves Big Data ☺

• Microsoft Technologies can help us with some SA Techniques

• Using Sentiment Analysis offers a huge competitive advantage

Page 37: Social text sentiment and tone analysis [aai 201] - (4160)

QUESTIONS

Positive / Negative / Neutral ones ☺

Page 38: Social text sentiment and tone analysis [aai 201] - (4160)

Contact us!

Rubén Pertusa López ([email protected])

@rpertusa

Data Platform Engineer, SolidQ

Microsoft Active Professional 2013

Paco González ([email protected])

@pacosql

Mentor, SolidQ

Page 40: Social text sentiment and tone analysis [aai 201] - (4160)

Thank you!