1
Identifying Emotions in Tweets Related to the Brazilian Stock Market
PhD thesis current status
Fernando J. V. da Silva
Supervisors: Ariadne M. B. R. Carvalho and Norton T. Roman
IC – Institute of Computing
2
Content
Motivation
Objectives
The Plutchik Wheel of Emotions
Research Methodology
Current Progress
Corpora
Manual Annotation
Preliminary Analysis of the Corpora
3
Motivation
Small investors use Twitter to discuss their trading operations
#bbas3 depois acho meu post mas ainda aguardo 18,75 [#bbas3 I will find my post later, but I am still waiting for 18.75]
Acordo da ALLL3 melou? [Did the ALLL3 deal fall through?]
Aí meu bolso... Bbas3 caiu pra c****** hoje [Ouch, my wallet... Bbas3 fell like (expletive) today]
Trade de venda no gráfico semanal acaba de ser acionado em BBAS3 [A sell signal on the weekly chart has just been triggered for BBAS3]
4
Motivation
Several previous works have found correlations between tweets and stock market indexes (Bollen et al. 2011, Gaskell et al. 2013, Zhang et al. 2011)
Some of them found correlations with the emotions expressed in tweets:
Fewer emotions in tweets indicate increases in the DJIA (Zhang et al. 2011)
Calm moods are good predictors of the DJIA (Bollen et al. 2011)
5
Motivation
The automatic identification of emotions in tweets could help predict the stock market
There is no similar research for the Brazilian stock market
6
Objectives
Identify emotions in Portuguese tweets
Apply the same technique to tweets related to the Brazilian stock market
7
The Plutchik Wheel of Emotions
A psychoevolutionary theory by Robert Plutchik (Plutchik and Kellerman, 1986)
The concept of emotion is applicable to all evolutionary levels and applies to animals as well as humans
It defines 8 basic emotions grouped into 4 pairs:
− Joy vs Sadness
− Fear vs Anger
− Trust vs Disgust
− Surprise vs Anticipation
8
Research Methodology
Use machine learning techniques similar to (Suttles and Ide, 2013), with one classifier for each pair of opposite emotions:
− Joy vs Sadness
− Anger vs Fear
− Trust vs Disgust
− Anticipation vs Surprise
If the probability of a tweet belonging to any of these classes is too small, the tweet is classified as “Neutral”
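The decision rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name label_tweet, the 0.6 threshold, and the idea that each pair classifier reports one probability per emotion (not necessarily summing to 1) are all hypothetical and not taken from the thesis.

```python
# The four pairs of opposite emotions from Plutchik's wheel.
PAIRS = [
    ("joy", "sadness"),
    ("anger", "fear"),
    ("trust", "disgust"),
    ("anticipation", "surprise"),
]


def label_tweet(pair_probs, threshold=0.6):
    """Combine the outputs of the four pair classifiers into tweet labels.

    pair_probs maps each pair to {emotion: probability}. The threshold
    value is an assumption chosen for illustration only.
    """
    labels = []
    for pair in PAIRS:
        probs = pair_probs[pair]
        # Keep the more probable emotion of the pair, if it is confident enough.
        best = max(pair, key=lambda emotion: probs[emotion])
        if probs[best] >= threshold:
            labels.append(best)
    # No classifier was confident enough: the tweet is neutral.
    return labels or ["neutral"]


example = {
    ("joy", "sadness"): {"joy": 0.90, "sadness": 0.05},
    ("anger", "fear"): {"anger": 0.20, "fear": 0.30},
    ("trust", "disgust"): {"trust": 0.10, "disgust": 0.10},
    ("anticipation", "surprise"): {"anticipation": 0.70, "surprise": 0.10},
}
print(label_tweet(example))
```

With this rule a tweet can receive up to four labels, one per pair, which matches the annotation scheme used for the manual corpus.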
9
Research Methodology (cont.)
Train the algorithm using a larger corpus of “context-free” tweets
Test using a “specific-context” corpus of stock market related tweets
Use an SVM tree kernel, as in (Agarwal et al., 2011), to compare the structure of tweets instead of word frequencies
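To make the idea of comparing structure concrete, here is a minimal sketch of a tree kernel in Python. It is a simplified subset-tree kernel in the style of Collins and Duffy, which is not necessarily the kernel used by (Agarwal et al., 2011). The nested-tuple tree encoding, the function names, and the toy trees are assumptions made for illustration.

```python
# A tree is a nested tuple (label, child, child, ...); a leaf is a plain string.

def node_label(node):
    return node if isinstance(node, str) else node[0]


def production(node):
    """The node label together with the labels of its direct children."""
    return (node[0], tuple(node_label(child) for child in node[1:]))


def internal_nodes(tree):
    """All non-leaf nodes of the tree, in pre-order."""
    if isinstance(tree, str):
        return []
    nodes = [tree]
    for child in tree[1:]:
        nodes.extend(internal_nodes(child))
    return nodes


def common_fragments(n1, n2, decay=1.0):
    """Weighted count of tree fragments rooted at both n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    score = decay
    for c1, c2 in zip(n1[1:], n2[1:]):
        if not isinstance(c1, str) and not isinstance(c2, str):
            score *= 1.0 + common_fragments(c1, c2, decay)
    return score


def tree_kernel(t1, t2, decay=1.0):
    """Sum the shared-fragment counts over all pairs of internal nodes."""
    return sum(
        common_fragments(a, b, decay)
        for a in internal_nodes(t1)
        for b in internal_nodes(t2)
    )


tree_a = ("S", ("A", "a"), ("B", "b"))
tree_b = ("S", ("A", "a"), ("B", "c"))
print(tree_kernel(tree_a, tree_a), tree_kernel(tree_a, tree_b))
```

The kernel value grows with the number of structural fragments two trees share, so two tweets can be similar even when they have few words in common. An SVM would use such a function in place of a dot product between word-frequency vectors.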
10
Research Methodology (cont.)
Sample tree representation for the tweet “@Fernando this isn't a great day for playing the HARP! :)”, from (Agarwal et al. 2011)
11
Research Methodology (cont.)
Manual annotation
Emotion Identification using Machine Learning with an SVM tree kernel
Emotion Identification using Machine Learning and n-gram attributes (Benchmark)
Tweets collection
12
Research Current Progress
Manual annotation (In Progress)
Emotion Identification using Machine Learning with an SVM tree kernel (To Do)
Emotion Identification using Machine Learning and n-gram attributes (Benchmark) (To Do)
Tweets collection (Done)
13
Corpora (cont.)
Specific-Context corpus: 2,402 unique tweets, each containing one of the 73 IBOVESPA stock codes (e.g. petr4 for Petrobras, bbas3 for Banco do Brasil)
− Manually annotated by 2 people
14
Corpora
Context-Free corpus: 26,407 unique tweets automatically collected from Twitter and automatically annotated according to their hashtags (Distant Supervision). Examples:
− #feliz (happy) → joy tweet
− #triste (sad) → sadness tweet
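Hashtag-based distant supervision can be sketched as below. Only the two hashtags shown above come from the slides. The function name distant_label, the choice to discard tweets with conflicting hashtags, and the removal of the labelling hashtag from the text are assumptions for illustration.

```python
import re

# Only these two mappings appear in the slides; a real mapping would be larger.
HASHTAG_TO_EMOTION = {"#feliz": "joy", "#triste": "sadness"}
HASHTAG_RE = re.compile(r"#\w+")


def distant_label(tweet):
    """Return (text, emotion) when the hashtags point to exactly one emotion.

    Tweets with no emotion hashtag, or with conflicting ones, yield None.
    """
    tags = {tag.lower() for tag in HASHTAG_RE.findall(tweet)}
    emotions = {HASHTAG_TO_EMOTION[tag] for tag in tags if tag in HASHTAG_TO_EMOTION}
    if len(emotions) != 1:
        return None

    def drop_label_tag(match):
        # Remove the labelling hashtag so a classifier cannot simply memorize it.
        return "" if match.group().lower() in HASHTAG_TO_EMOTION else match.group()

    text = " ".join(HASHTAG_RE.sub(drop_label_tag, tweet).split())
    return text, emotions.pop()


print(distant_label("Que dia lindo #Feliz"))
```

Because the labels come from the authors' own hashtags, no manual annotation is needed for this corpus, at the cost of some label noise.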
15
Manual Annotation Process
Process inspired by (Suttles and Ide 2013): emotions are identified according to Plutchik's wheel of emotions
Each tweet is marked either as neutral or with up to 4 emotions, at most one from each pair (joy or sadness, anger or fear, trust or disgust, anticipation or surprise)
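The annotation constraint can be expressed as a small validity check. This is a hypothetical helper, not part of the annotation tool described in the slides; the name is_valid_annotation and the lowercase label strings are assumptions.

```python
EMOTION_PAIRS = [
    ("joy", "sadness"),
    ("anger", "fear"),
    ("trust", "disgust"),
    ("anticipation", "surprise"),
]
KNOWN_EMOTIONS = {emotion for pair in EMOTION_PAIRS for emotion in pair}


def is_valid_annotation(labels):
    """Check that labels are either neutral alone or at most one emotion per pair."""
    chosen = set(labels)
    if chosen == {"neutral"}:
        return True
    # Empty annotations and neutral mixed with emotions are rejected.
    if not chosen or "neutral" in chosen:
        return False
    if not chosen <= KNOWN_EMOTIONS:
        return False
    # Opposite emotions of the same pair cannot be marked together.
    return all(len(chosen & set(pair)) <= 1 for pair in EMOTION_PAIRS)


print(is_valid_annotation(["joy", "fear"]), is_valid_annotation(["joy", "sadness"]))
```

A check like this could run each time an annotator submits a tweet, so that inconsistent labels are caught before the two annotators' results are compared.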
16
Manual Annotation Process (cont.)
A simple command-line tool was developed to help with the annotation
17
Preliminary Analysis of the Corpora
Using word frequencies to help answer some questions:
− Do tweets really differ in opposite pairs of emotions?
− How similar are tweets with the same emotion in different corpora?
− Can EmoLex (Mohammad, 2013) terms help identify emotions?
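A word-frequency comparison of this kind can be sketched as follows. The toy tweets are invented for illustration, and the Jaccard overlap between the top-k word lists is an assumed similarity measure; the slides do not state which measure was used.

```python
from collections import Counter


def top_words(tweets, k=10):
    """Return the k most frequent lowercase tokens in a list of tweets."""
    counts = Counter(word for tweet in tweets for word in tweet.lower().split())
    return [word for word, _ in counts.most_common(k)]


def overlap(words_a, words_b):
    """Jaccard overlap between two word lists (0 = disjoint, 1 = identical)."""
    a, b = set(words_a), set(words_b)
    return len(a & b) / len(a | b) if a | b else 0.0


# Toy data, invented for this example only.
joy_tweets = ["feliz hoje", "muito feliz"]
sad_tweets = ["triste hoje", "muito triste"]

print(top_words(joy_tweets, 1), top_words(sad_tweets, 1))
print(overlap(top_words(joy_tweets, 3), top_words(sad_tweets, 3)))
```

A low overlap between opposite emotions in the same corpus, and a high overlap for the same emotion across corpora, would both be good signs for training on one corpus and testing on the other.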
19
Do tweets differ in pairs of emotions?
[Figure: word frequencies per emotion in the Context-Free corpus. Panels: Joy, Sad, Trust, Disgust, Anger, Fear, Anticipation, Surprise]
20
How similar are tweets in different corpora?
[Figure: word frequencies for Joy, Sad, Anger and Fear tweets. Rows: Context-Free corpus, Specific-Context corpus]
21
How similar are tweets in different corpora?
[Figure: word frequencies for Trust, Disgust, Anticipation and Surprise tweets. Rows: Context-Free corpus, Specific-Context corpus]
22
Can EmoLex terms help with emotion identification?
What is EmoLex?
− Research by (Mohammad, 2013)
− 14,182 unigrams (words) associated with emotions
− Manually created by crowdsourcing
− Available in 20 languages (including Portuguese)
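Lexicon terms could be turned into classifier attributes roughly as sketched below. The three lexicon entries are hypothetical stand-ins and are not copied from EmoLex; the function name lexicon_features and the simple whitespace tokenization are also assumptions.

```python
from collections import Counter

# Hypothetical entries for illustration only; the real lexicon is far larger.
TOY_LEXICON = {
    "lucro": {"joy", "anticipation"},
    "queda": {"sadness", "fear"},
    "medo": {"fear"},
}


def lexicon_features(tweet, lexicon):
    """Count, per emotion, how many tweet tokens the lexicon links to it."""
    counts = Counter()
    for token in tweet.lower().split():
        word = token.strip(".,!?;:")
        for emotion in lexicon.get(word, ()):
            counts[emotion] += 1
    return dict(counts)


print(lexicon_features("Medo de queda!", TOY_LEXICON))
```

The resulting per-emotion counts can be appended to other attributes, such as n-grams, when training a classifier.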
27
Conclusions
Tweets annotated with opposite emotions differ in their most frequent words
However, tweets with the same emotion do not share their most frequent words across the two corpora
The frequencies of EmoLex terms vary according to emotion and may be useful as attributes
28
Next Steps
Develop a web-based tool for “crowdsourcing” the annotation
Conduct machine learning experiments for emotion identification using n-grams as attributes, to be used as a benchmark
Create tree representations for the tweets
Conduct experiments using tree representations for emotion identification with an SVM tree kernel, as in (Agarwal et al. 2011)
Compare results in the two corpora
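For the n-gram benchmark listed among the next steps, attribute extraction might look like the sketch below. The function name ngram_features, the whitespace tokenization, and the default of unigrams plus bigrams are assumptions; the slides do not specify these details.

```python
from collections import Counter


def ngram_features(text, n_max=2):
    """Count all word n-grams of length 1..n_max in a lowercased tweet."""
    tokens = text.lower().split()
    features = Counter()
    for n in range(1, n_max + 1):
        for start in range(len(tokens) - n + 1):
            features[" ".join(tokens[start:start + n])] += 1
    return dict(features)


print(ngram_features("Bbas3 caiu hoje"))
```

Each distinct n-gram becomes one attribute, and its count in the tweet is the attribute value passed to the classifier.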
30
References
(Agarwal et al. 2011) Apoorv Agarwal, Boyi Xie, Ilia Vovsha, Owen Rambow, and Rebecca Passonneau. Sentiment analysis of Twitter data. In Proceedings of the Workshop on Languages in Social Media, pages 30–38. Association for Computational Linguistics, 2011.
(Bollen et al. 2011) Johan Bollen, Huina Mao, and Xiaojun Zeng. Twitter mood predicts the stock market. Journal of Computational Science, 2(1):1–8, 2011.
(Gaskell et al. 2013) Paul Gaskell, Frank McGroarty, and Thanassis Tiropanis. An investigation into correlations between financial sentiment and prices in financial markets. In Proceedings of the 5th Annual ACM Web Science Conference, pages 99–108. ACM, 2013.
(Mohammad, 2013) Saif M. Mohammad and Peter D. Turney. Crowdsourcing a word-emotion association lexicon. 29(3):436–465, 2013.
(Suttles and Ide 2013) Jared Suttles and Nancy Ide. Distant supervision for emotion classification with discrete binary values. In Computational Linguistics and Intelligent Text Processing, pages 121–136. Springer, 2013.
(Plutchik and Kellerman 1986) Robert Plutchik and Henry Kellerman. Emotion: Theory, Research and Experience. Acad. Press, 1986.
(Zhang et al. 2011) Xue Zhang, Hauke Fuehres, and Peter A. Gloor. Predicting stock market indicators through Twitter “I hope it is not as bad as I fear”. Procedia - Social and Behavioral Sciences, 26(0):55–62, 2011. The 2nd Collaborative Innovation Networks Conference - COINs2010.
31
Emotions and Behaviours
Stimulus event | Cognitive appraisal | Subjective reaction | Behavioral reaction | Function
Threat | “danger” | Fear | Escape | Safety
Obstacle | “enemy” | Anger | Attack | Destroy obstacle
Gain of valued object | “possess” | Joy | Retain or repeat | Gain resources
Loss of valued object | “abandonment” | Sadness | Cry | Reattach to lost object
Member of one's group | “friendship” | Trust | Groom | Mutual support
Unpalatable object | “poison” | Disgust | Vomit | Eject poison
New territory | “examine” | Anticipation | Map | Knowledge of territory
Unexpected event | “what is it?” | Surprise | Stop | Gain time to orient
32
Analogy to stock market investors
Stimulus event | Cognition | Emotional state | Displayed behavior | Effect
Threat to a watched stock | “risk of loss” | Fear | Escape (sell the stock or do not buy it) | Safety
Obstacle (to a stock's returns) | “enemy that causes losses” | Anger | Act against it if possible, otherwise just express outrage | Destroy or get around the obstacle
Making profits | “possess” | Joy | Retain or repeat | Gain resources
Loss | “loss” | Sadness | Lament | Make up for the loss
Trusted investor or company | “friendship” | Trust | Follow advice or buy the company's shares | Help in trying to make profits
Bad scenario or very poorly performing stock | “risk of loss” | Disgust | Keep away | Reduce risks
New scenario (expected price change) | “examine” | Anticipation | Trade according to the forecast | Knowledge of the future scenario
Unexpected event | “what is it?” | Surprise | Stop | Gain time to orient