Pollyanna Gonçalves (UFMG, Brazil) Matheus Araújo (UFMG, Brazil)
Fabrício Benevenuto (UFMG, Brazil) Meeyoung Cha (KAIST, Korea)
Comparing and Combining Sentiment Analysis Methods
Key component of a new wave of applications that explore social network data
Summary of public opinion about: politics, products, services (e.g. a new car, a movie), etc.
Monitor social network data (in real time)
Commonly framed as polarity analysis (positive or negative)
Sentiment Analysis on Social Networks
Which method to use?
Several methods have been proposed for different contexts
Several of them are popular
Validations are based on examples, comparisons with baselines, and limited datasets
There is no proper comparison among methods
Advantages? Disadvantages? Limitations?
Sentiment Analysis Methods
Compare 8 popular sentiment analysis methods
Focus on the task of detecting polarity: positive vs. negative
Combine methods
Deploy the methods in a system --- www.ifeel.dcc.ufmg.br
This talk
iFeel System & Conclusions
Methods & Methodology
Comparing & Combining
Extracted from instant messaging services (Skype, MSN, Yahoo Messenger, etc.)
Grouped as positive and negative
Emoticons
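A minimal sketch of an emoticon-based classifier, assuming two small hypothetical emoticon groups (the lists used in the talk are not reproduced here). Returning no label when a message has no emoticon, or has emoticons from both groups, is also an assumption; it illustrates why such a method can only cover a fraction of messages.

```python
# Hypothetical emoticon groups for illustration; not the lists used in the talk.
POSITIVE = {":)", ":-)", ":D", "=)", ";)"}
NEGATIVE = {":(", ":-(", ":'(", "=(", "D:"}


def emoticon_polarity(message):
    """Return 'positive', 'negative', or None when no decision can be made."""
    tokens = message.split()
    has_pos = any(tok in POSITIVE for tok in tokens)
    has_neg = any(tok in NEGATIVE for tok in tokens)
    if has_pos and not has_neg:
        return "positive"
    if has_neg and not has_pos:
        return "negative"
    # No emoticon at all, or mixed signals (assumed tie handling).
    return None


print(emoticon_polarity("Feeling too happy today :)"))
```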
Lexical method (paid software)
Allows the lexical dictionary to be customized; we used the default
Measures various emotional, cognitive, and structural components
We only consider sentiment-relevant categories such as positivity, negativity
Linguistic Inquiry and Word Count (LIWC)
Lexical approach based on the WordNet dictionary
Groups words into sets of synonyms
Detects positivity, negativity, and neutrality of texts
SentiWordNet
Lexical method adapted from a psychometric scale
Consists of a dictionary of adjectives associated with sentiments
Positive: joviality, assurance, serenity, and surprise
Negative: fear, sadness, guilt, hostility, shyness, and fatigue
PANAS-t
Uses a well-known lexical dictionary namely Affective Norms for English Words (ANEW)
Produces a happiness score from 1 (extremely unhappy) to 9 (extremely happy)
We consider [1..5) for negative and [5..9] for positive
Happiness Index
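The polarity rule stated for the Happiness Index can be written as a small function. This is only a sketch of the thresholding step: the function name is hypothetical, and it assumes the 1 to 9 score has already been computed elsewhere.

```python
def happiness_polarity(score):
    """Map a Happiness Index score in [1, 9] to a polarity label.

    Scores in [1, 5) are treated as negative and scores in [5, 9] as positive.
    """
    if not 1 <= score <= 9:
        raise ValueError("score must lie in [1, 9]")
    return "negative" if score < 5 else "positive"


for s in (1, 4.9, 5, 9):
    print(s, happiness_polarity(s))
```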
Combines 9 supervised machine learning methods
Estimates the strength of positive and negative sentiment in a text
We used the trained model provided by the authors
SentiStrength
Machine learning method trained with a Naïve Bayes model
Trained model implemented as a Python library
Classifies tweets (in JSON format) as positive, negative, neutral, or unsure
SAIL/AIL Sentiment Analyzer (SASA)
Extract cognitive and affective information using natural language processing techniques
Uses the affective categorization model Hourglass of Emotions
Provides an approach that classifies messages as positive or negative
SenticNet
Comparison of coverage and prediction performance across different datasets
Dataset 1: human labeled
About 12,000 messages labeled with Amazon Mechanical Turk: Twitter, MySpace, YouTube and Digg comments, BBC and Runners World forums
Dataset 2: unlabeled
Complete snapshot of Twitter (collected in 2009), ~2 billion tweets
Extracted tweets about tragedies, disasters, movie releases, and political events
Focus on the English messages
Methodology
What is the coverage of each method?
Coverage vs. Prediction Performance
Emoticons: best prediction and worst coverage
SentiStrength: second in prediction and third in coverage
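A rough sketch of how the two quantities can be computed for one method. It assumes that coverage is the fraction of messages for which the method returns any polarity, and that prediction performance is an F-measure for the positive class over covered messages only; the talk does not spell out these definitions, and the function names and toy data are invented for illustration.

```python
def coverage(predictions):
    """Fraction of messages for which the method returned a polarity."""
    if not predictions:
        return 0.0
    decided = [p for p in predictions if p is not None]
    return len(decided) / len(predictions)


def f_measure(predictions, labels, target="positive"):
    """F1 for `target`, computed only over messages the method covered."""
    pairs = [(p, y) for p, y in zip(predictions, labels) if p is not None]
    tp = sum(1 for p, y in pairs if p == target and y == target)
    fp = sum(1 for p, y in pairs if p == target and y != target)
    fn = sum(1 for p, y in pairs if p != target and y == target)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data, invented for illustration only.
labels = ["positive", "negative", "positive", "negative", "positive"]
predictions = ["positive", None, "positive", "positive", None]

print(coverage(predictions))
print(f_measure(predictions, labels))
```

A method that abstains often can score well on the second number while scoring poorly on the first, which is the trade-off described for Emoticons.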
Prediction Performance across datasets
Method           Twitter  MySpace  YouTube  BBC    Digg   Runners World
PANAS-t          0.643    0.958    0.737    0.396  0.476  0.698
Emoticons        0.929    0.952    0.948    0.359  0.939  0.947
SASA             0.750    0.710    0.754    0.346  0.502  0.744
SenticNet        0.757    0.884    0.810    0.251  0.424  0.826
SentiWordNet     0.721    0.837    0.789    0.384  0.456  0.780
SentiStrength    0.843    0.915    0.894    0.532  0.632  0.778
Happiness Index  0.774    0.925    0.821    0.246  0.393  0.832
LIWC             0.690    0.862    0.731    0.377  0.585  0.895
Strong variations across datasets
Worst performance for datasets containing formal text
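Both observations can be checked from the numbers in the table. The sketch below copies the table values and computes two summaries chosen here for illustration (neither is reported in the talk): the spread between each method's best and worst dataset, and the mean score per dataset across methods. With these numbers, BBC and Digg come out with the lowest means.

```python
DATASETS = ["Twitter", "MySpace", "YouTube", "BBC", "Digg", "Runners World"]

# Values copied from the prediction performance table.
SCORES = {
    "PANAS-t":         [0.643, 0.958, 0.737, 0.396, 0.476, 0.698],
    "Emoticons":       [0.929, 0.952, 0.948, 0.359, 0.939, 0.947],
    "SASA":            [0.750, 0.710, 0.754, 0.346, 0.502, 0.744],
    "SenticNet":       [0.757, 0.884, 0.810, 0.251, 0.424, 0.826],
    "SentiWordNet":    [0.721, 0.837, 0.789, 0.384, 0.456, 0.780],
    "SentiStrength":   [0.843, 0.915, 0.894, 0.532, 0.632, 0.778],
    "Happiness Index": [0.774, 0.925, 0.821, 0.246, 0.393, 0.832],
    "LIWC":            [0.690, 0.862, 0.731, 0.377, 0.585, 0.895],
}

# Spread between each method's best and worst dataset.
spread = {method: max(vals) - min(vals) for method, vals in SCORES.items()}

# Mean score per dataset, averaged over the eight methods.
mean_by_dataset = {
    name: sum(vals[i] for vals in SCORES.values()) / len(SCORES)
    for i, name in enumerate(DATASETS)
}

# Datasets ordered from lowest to highest mean score.
hardest = sorted(mean_by_dataset, key=mean_by_dataset.get)

for method, value in spread.items():
    print(f"{method:16s} spread {value:.3f}")
for name in hardest:
    print(f"{name:14s} mean {mean_by_dataset[name]:.3f}")
```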
Polarity Analysis
Detected only positive sentiments!
Methods tend to detect more positive sentiments
The share of positive messages labeled as positive is usually greater than the share of negative messages labeled as negative
Even disasters were classified predominantly as positive
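One simple way to quantify this bias on an unlabeled event is the share of positive labels among the messages a method decided on. The sketch below assumes per-message outputs of 'positive', 'negative', or None; the function name and the toy input are hypothetical.

```python
from collections import Counter


def positive_share(predictions):
    """Share of 'positive' among messages labeled positive or negative.

    Returns None when the method decided on no message at all.
    """
    counts = Counter(p for p in predictions if p in ("positive", "negative"))
    decided = counts["positive"] + counts["negative"]
    return counts["positive"] / decided if decided else None


# Toy outputs for one event, invented for illustration only.
event_predictions = ["positive", "positive", "negative", None, "positive"]
print(positive_share(event_predictions))
```

A value well above 0.5 for an event such as a disaster would be a sign of the positivity bias described above.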
Combines 7 of the 8 methods analyzed
Emoticons, SentiStrength, Happiness Index, SenticNet, SentiWordNet, PANAS-t, SASA
Removed LIWC (paid method)
Weights are distributed according to the rank of prediction performance:
Higher weight for the method with the highest F-measure
Emoticons received weight 7 and PANAS-t weight 1
Combined Method
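A sketch of one way to combine rank-weighted methods, assuming a weighted majority vote. Only the weights 7 (Emoticons) and 1 (PANAS-t) come from the talk; the intermediate weights, the voting rule, and the tie handling are placeholders for illustration and should not be read as the authors' exact procedure.

```python
# Weights 7 and 1 are from the talk; the values in between are assumed
# placeholders and do not reflect the authors' actual ranking.
WEIGHTS = {
    "Emoticons": 7,
    "SentiStrength": 6,    # assumed
    "Happiness Index": 5,  # assumed
    "SenticNet": 4,        # assumed
    "SentiWordNet": 3,     # assumed
    "SASA": 2,             # assumed
    "PANAS-t": 1,
}


def combined_polarity(votes, weights=WEIGHTS):
    """Weighted vote over per-method outputs.

    `votes` maps a method name to 'positive', 'negative', or None
    (None and any other label are ignored).
    """
    totals = {"positive": 0, "negative": 0}
    for method, vote in votes.items():
        if vote in totals:
            totals[vote] += weights[method]
    if totals["positive"] == totals["negative"]:
        return None  # no method decided, or an exact tie (assumed handling)
    return max(totals, key=totals.get)


votes = {"Emoticons": "positive", "SASA": "negative", "PANAS-t": "negative"}
print(combined_polarity(votes))
```

Because the combined output is defined whenever at least one method decides, a scheme of this kind covers more messages than any single method.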
Combined Method
Best coverage and second in prediction performance
Combining 4 methods is sufficient
Example for: “Feeling too happy today :)”
Deploys all methods, except LIWC
Allows evaluating an entire file
Allows changing the parameters of the methods
iFeel (Beta version): www.ifeel.dcc.ufmg.br
We compare 8 popular sentiment analysis methods for detecting polarity
No method had the best results in all analyses
Prediction performance varies widely across datasets
Most methods are biased towards positivity
We propose a combined method
Achieves high coverage and high prediction performance
iFeel: methods deployed and easily available
Future work: compare other methods such as POMS and EMOLEX
Conclusions