Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Wednesday 18 February 2009 The...

59
Ann Devitt Trinity College Dublin Wednesday 18 February 2009 Wednesday 18 February 2009 Wednesday 18 February 2009 The Languages of Emotion and Financial News Ann Devitt Khurshid Ahmad
  • date post

    20-Dec-2015
  • Category

    Documents

  • view

    214
  • download

    0

Transcript of Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Wednesday 18 February 2009 The...

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Wednesday 18 February 2009Wednesday 18 February 2009

The Languages of Emotion and Financial NewsAnn Devitt

Khurshid Ahmad

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Sentiment and the MarketsSentiment and the Markets

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Sentiment and the MarketsSentiment and the Markets

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Global chipmakers, battling slower technology

demand, are betting size matters as they pin their

hopes for future growth on small and easy to

carry mobile devices such as netbooks and

smartphones.

Specialised Language of Financial Specialised Language of Financial NewsNews

Bloomberg.com, 18/2/09

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Global chipmakers, battling slower

technology demand, are betting size matters as

they pin their hopes for future growth on small

and easy to carry mobile devices such as

netbooks and smartphones.

Specialised Language of Financial Specialised Language of Financial NewsNews

Bloomberg.com, 18/2/09

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Global chipmakers, battling slower technology

demand, are betting size matters as they pin their

hopes for future growth on small and

easy to carry mobile devices such as netbooks

and smartphones.

Specialised Language of Financial Specialised Language of Financial NewsNews

Bloomberg.com, 18/2/09

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Global chipmakers, battling slower

technology demand, are betting size matters as

they pin their hopes for future growth

on small and easy to carry mobile devices

such as netbooks and smartphones.

Specialised Language of Financial Specialised Language of Financial NewsNews

Bloomberg.com, 18/2/09

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Sentiment and the MarketsSentiment and the Markets

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Sentiment and the MarketsSentiment and the Markets

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Engle Ng (1993) Asymmetry CurveEngle Ng (1993) Asymmetry Curve

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

OutlineOutline

Current psychological theory of emotion

Evaluation of lexical “emotion” resources

Corpus analysis of language of “emotion”

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

OutlineOutline

Current psychological theory of emotion

Evaluation of lexical “emotion” resources

Corpus analysis of language of “emotion”

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Cognitive Theory of Emotion:Cognitive Theory of Emotion:CategoricalCategorical

Ekman (1975)

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Cognitive Theory of Emotion:Cognitive Theory of Emotion:DimensionsDimensions

Osgood / RussellEvaluationActivityPotency

Mehabrian PADPleasureActivationDominance

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Cognitive Theory of EmotionCognitive Theory of Emotion

Watson and Tellegen (1985)

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

OutlineOutline

Current psychological theory of emotion

Evaluation of lexical “emotion” resources

Corpus analysis of language of “emotion”

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Lexical Resource EvaluationLexical Resource Evaluation

SentiWordNet

WhisselWNA

General Inquirer

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Whissel

Lexical Resource Evaluation Lexical Resource Evaluation Senti WordNetSenti WordNet

SentiWordNet

WNA

General Inquirer

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Lexical Resource Evaluation Lexical Resource Evaluation Senti WordNetSenti WordNet

Word PositiveVal NegativeValHappy 0.9 0.0Sad 0.0 0.9

39066 termsEvaluation dimension scale: 0 - 1Low average: Pos=0.18, Neg=0.23More extreme Neg valuesError-prone: rude (pos 0.875), gladsome (neg 0.875)

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Lexical Resource Evaluation Lexical Resource Evaluation General InquirerGeneral Inquirer

SentiWordNet

WhisselWNA

General Inquirer

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Lexical Resource Evaluation Lexical Resource Evaluation General InquirerGeneral Inquirer

ECSTATIC Pos PleasureSORROWFUL Neg Pain

Hand-coded, content analysis basis8641 terms184 binary categories (including MAB dimensions)Negative > PositiveActive > PassiveStrong > Weak

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Lexical Resource Evaluation Lexical Resource Evaluation Whissel Dictionary of AffectWhissel Dictionary of Affect

SentiWordNet

WhisselWNA

General Inquirer

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Lexical Resource Evaluation Lexical Resource Evaluation Whissel Dictionary of AffectWhissel Dictionary of Affect

Word Eval ActivImag

great 2.6250 2.1250 1.0

disastrous 1.4444 2.4000 2.0Corpus selection, hand-coded8742 termsDimensional representation: 1-3 scaleEvaluation, Activation, Imagery

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Lexical Resource Evaluation Lexical Resource Evaluation WordNet AffectWordNet Affect

SentiWordNet

WhisselWNA

General Inquirer

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Lexical Resource EvaluationLexical Resource EvaluationWordNet AffectWordNet AffectWord BinaryFeatures

Loneliness cognitive state, emotionHappiness cognitive state, emotion

5432 termsDomains of emotional experienceNo PolarityShort-term: Mood, MannerLong-term: Attribute, Trait

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Lexical Resource EvaluationLexical Resource EvaluationLexical OverlapLexical Overlap

Are the lexica consistent?

Are they mutually exclusive?

Dice, Jaccard, Asymmetric coefficients

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Whissel

Lexical Resource EvaluationLexical Resource Evaluation Lexical OverlapLexical Overlap

SentiWordNet

WNA

General Inquirer

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

SentiWordNet

Lexical Resource EvaluationLexical Resource Evaluation Lexical OverlapLexical Overlap

General Inquirer

Whissel

WNA

1. Statistically significant agreement for Polarity Assignment (Chi square test)

2. Very weak correlation for activation features.

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

SentiWordNet

Lexical Resource EvaluationLexical Resource Evaluation Lexical OverlapLexical Overlap

General Inquirer

Whissel

WNA

1. Weak correlation of SWN with Whissel evaluation

2. No correlation with Whissel activation dimension

3. SWN positive negatively correlated with imageability

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

SentiWordNet

Lexical Resource EvaluationLexical Resource Evaluation Lexical OverlapLexical Overlap

WNAGeneral InquirerWhissel

1. SWN tends to negative for short term WNA features

2. SWN tends to positive for long-term WNA features

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Whissel

Lexical Resource EvaluationLexical Resource Evaluation Lexical OverlapLexical Overlap

SentiWordNet

WNA

General Inquirer

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Lexical Resource EvaluationLexical Resource Evaluation Lexical OverlapLexical Overlap

WNA feature division:Short-term Long-term

Negative Positive

Physical Cognitive

More active Less active

Internal External

Less abstract More concrete

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Lexical Resource EvaluationLexical Resource EvaluationSome conclusionsSome conclusions

The lexica:Are quite consistent

Can be used in combination

SentiWN: Largely unexplored territory

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

OutlineOutline

Current psychological theory of emotion

Evaluation of lexical “emotion” resources

Corpus analysis of language of “emotion”

General Language

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Emotion in General LanguageEmotion in General LanguageCorpus Study AimsCorpus Study Aims

Does “emotion” constitute a distinct sub-language?

Is there a polarity bias in General Language? (the Polyanna Hypothesis of Boucher and Osgood)

What is the impact of using different lexica?

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Corpus AnalysisCorpus AnalysisThe DataThe Data

BNC100 million words

Balanced, broad corpus

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Corpus AnalysisCorpus AnalysisMethodologyMethodology

Is emotion a distinct sub-language?

Examine distribution type

Examine distribution spread

Bootstrap sampling distribution

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Corpus AnalysisCorpus AnalysisDistribution TypeDistribution Type

Zipfian: BNC=Emotion Lexica

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Corpus AnalysisCorpus AnalysisDistribution shapeDistribution shape

Comparison of means: student t-test

BNC ≠ Emotion Lexica (p<0.000)

Different sample means5-30 times more frequent than gen. languageAssumptions of test?

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Corpus AnalysisCorpus AnalysisBootstrap Sampling DistributionBootstrap Sampling Distribution

Are sentiment-bearing terms a statistically distinct and highly frequent subset of English?

1000 random samples of terms from BNCSample size = size of sentiment lexiconH0: Observed sample falls inside within 95% of bootstrap random sampling distribution of means

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Corpus AnalysisCorpus AnalysisBootstrap Sampling DistributionBootstrap Sampling Distribution

Are sentiment-bearing terms a statistically distinct and highly frequent subset of English?

For all lexica: Mean term frequency of lexicon well outside 95%Sentiment lexica are not representative of BNC (p<0.05)

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Corpus AnalysisCorpus AnalysisSentiment FeaturesSentiment Features

Is there a polarity bias in General Language?

Positive polarity biasStatistically significant for all lexica (χ2 test of independence)

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Corpus AnalysisCorpus AnalysisSentiment FeaturesSentiment Features

Is there a polarity bias in General Language when you include intensity of polarity?

Positive polarity biasStatistically significant for all lexicaχ2 = 158.5, df=1, p<0.0001 for General Inquirerχ2 = 63.6, df=1, p<0.0001 for Whissel

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Corpus AnalysisCorpus AnalysisSome conclusionsSome conclusions

Sentiment-bearing terms are a distinct subset of English

Positive polarity bias in BNC

General Inquirer and Whissel: Low coverage and high frequency

SentiwordNet:Wide coverage and much lower frequency

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

OutlineOutline

Current psychological theory of emotion

Evaluation of lexical “emotion” resources

Corpus analysis of language of “emotion”Comparative

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Comparative Corpus AnalysisComparative Corpus AnalysisAimsAims

Examine affective term use

Identify statistically different distributions

Is there a dominant feature/polarity?

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Comparative Corpus AnalysisComparative Corpus AnalysisThe DataThe Data

Financial Language

2 million words

On-line financial news:•Reuters, CNN, Bloomberg

•Newspapers

General Language

BNC100 million words

Balanced, broad corpus

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Comparative Corpus AnalysisComparative Corpus Analysis The Data The Data

BNC sub-corpora

Imaginative written English16 million words

Informative written English70 million words

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Comparative Corpus AnalysisComparative Corpus AnalysisMethodologyMethodology

Compare proportions of Sentiment Features

χ2 Test of Independence

H0: π FinCorpus = π BNC

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Comparative Corpus AnalysisComparative Corpus Analysis Methodology Methodology

Statistical significance of different proportionχ2 > 7.8794p >= 0.005

Features: 41 Lexicon Sentiment Features from 4 lexicaFrequency per million words

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Comparative Corpus AnalysisComparative Corpus AnalysisFinancial CorpusFinancial Corpus

WRT Imaginative: More affective terms

WRT Informative: Many more affective terms

WRT BNC: Dependent on feature typeDistributions are statistically distinct

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Comparative Corpus AnalysisComparative Corpus AnalysisPositive GI FeaturesPositive GI Features

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Comparative Corpus AnalysisComparative Corpus AnalysisPositive GI FeaturesPositive GI Features

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Comparative Corpus AnalysisComparative Corpus AnalysisNegative GI FeaturesNegative GI Features

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Comparative Corpus AnalysisComparative Corpus AnalysisNegative GI FeaturesNegative GI Features

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Comparative Corpus AnalysisComparative Corpus AnalysisNegative GI FeaturesNegative GI Features

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Some conclusionsSome conclusions

Lexical resources for sentiment are consistent

Financial news is a sub-language: Affective content is statistically distinct relative to general language

Text polarity is asymmetric, positive skewDifferent skews for different domains

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Something to think aboutSomething to think about

If different language varieties and domains have distinct use of sentiment terms and their own polarity bias:Individual sentiment values are not informativeSo what do we need??

Ann DevittTrinity College Dublin

Wednesday 18 February 2009

Thank You!Thank You!

[email protected]