CS626 - Speech, NLP, Web
Harnessing Annotation Process Data for Natural Language Processing: An Investigation Based on Eye-Tracking
Lecture 37
Presented by: Abhijit Mishra (Roll no.: 114056002)
Under the guidance of: Prof. Pushpak Bhattacharyya (PhD advisor) and Prof. Michael Carl (Mentor)
3rd Nov, 2014, Venue: CFILT
Acknowledgements: Aditya J., Nivvedan S.
Background
Eye-movement data as a form of annotation: Translation Complexity and Sentiment Annotation Complexity measurement
Eye-movement data for cognitive modeling: extracting "signature scanpaths" to study trends in linguistic-task-oriented reading; user-specific topic modeling using eye-gaze information
Eye-movement data for cognitive studies in linguistics: study of subjectivity extraction in sentiment annotation
Conclusion and future work
Roadmap
Joshi, Aditya; Mishra, Abhijit; S., Nivvedan; Bhattacharyya, Pushpak. 2014. Measuring Sentiment Annotation Complexity of Text. ACL 2014, Baltimore, USA.
Mishra, Abhijit; Joshi, Aditya; Bhattacharyya, Pushpak. 2014. A Cognitive Study of Subjectivity Extraction in Sentiment Annotation. 5th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis (WASSA), ACL 2014, Baltimore, USA.
Mishra, Abhijit; Singhal, Shubham; Bhattacharyya, Pushpak. 2014. Combining Scanpaths to Identify Common Eye-Movement Strategies in Linguistic-Task-Oriented Reading. Under review.
Mishra, Abhijit; Bhattacharyya, Pushpak; Carl, Michael. 2013. Automatically Predicting Sentence Translation Difficulty. ACL 2013, Sofia, Bulgaria.
Mishra, A.; Carl, M.; Bhattacharyya, P. 2012. A Heuristic-Based Approach for Systematic Error Correction of Gaze Data for Reading. First Workshop on Eye-Tracking and Natural Language Processing, COLING 2012, Mumbai, India.
Additional collaborative work:
Kunchukuttan, Anoop; Mishra, Abhijit; Chatterjee, Rajen; Shah, Ritesh; Bhattacharyya, Pushpak. 2014. Shata-Anuvadak: Tackling Multiway Translation of Indian Languages. LREC 2014, Reykjavik, Iceland, 26-31 May 2014.
Publications
The eye-tracking and sentiment database created at IIT Bombay as part of this work, along with several scripts for data processing and analysis, is released under a Creative Commons License:
http://www.cfilt.iitb.ac.in/~cognitive-nlp/
The predictive frameworks for Sentiment Annotation Complexity and Translation Complexity computation are made available as web services at
http://www.cfilt.iitb.ac.in/TCI and http://www.cfilt.iitb.ac.in/SAC
Released Resources and Tools
BACKGROUND
NLP state of the art: machine learning + linguistics. Supervised/semi-supervised methods are popular and relatively accurate.
The BIG Picture
[Figure: The big picture. Raw data → features and labels → training data → annotation model → prediction. Annotation process data (gaze patterns, keystroke sequences) recorded alongside supplies additional information regarding how humans annotate.]
Annotation Process Data – A byproduct of Annotation
Annotation: the task of labelling text, images or other data with comments, explanations, tags or markups.
Example: The movie was good.
Part-of-speech annotation: The/DT movie/NN was/VBD good/JJ ./.
Translation (to Hindi): फिल्म अच्छी थी.
Sentiment annotation: Positive
Annotation involves visualization, comprehension (understanding) and production (producing annotations), requiring "reading" and "typing".
Annotation process data: an organized representation of such activities.
Data representing reading and writing activities.
Gaze data:
• Gaze points: position of the eye-gaze on the screen
• Fixations: a long stay of the gaze on a particular object on the screen. Fixations have both spatial (coordinates) and temporal (duration) properties.
• Saccade: a very rapid movement of the eye between positions of rest
• Scanpath: a path connecting a series of fixations
• Regression: revisiting a previously read segment
Keystrokes:
• Insertion: inserting a word or character inside running text
• Deletion: deleting a word or character
• Selection: highlighting
• Rearrangement: dragging or copy/pasting highlighted text
Other types of data: EEG signals, speech, etc.
Annotation Process Data
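These gaze-data concepts can be made concrete with a small data-structure sketch; the field names and the left-to-right regression heuristic are illustrative, not tied to any particular eye-tracker's output format:

```python
from dataclasses import dataclass

@dataclass
class Fixation:
    x: float        # spatial property: screen coordinates
    y: float
    duration: int   # temporal property: milliseconds

# A scanpath is simply the sequence of fixations; the saccades are the
# implicit rapid movements between consecutive fixations.
scanpath = [Fixation(102, 48, 230), Fixation(178, 50, 180), Fixation(95, 49, 210)]

def is_regression(prev: Fixation, curr: Fixation) -> bool:
    """Crude heuristic for left-to-right reading: a saccade landing to the
    left of (or above) the previous fixation revisits earlier text."""
    return curr.x < prev.x or curr.y < prev.y

regressions = sum(is_regression(a, b) for a, b in zip(scanpath, scanpath[1:]))
```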
Human eye movement is poised between perception and cognition. The eye-movement pattern during goal-oriented reading is driven by:
• perceptual properties of the text
• cognitive processes underlying language processing
Eye-movement is controlled by the “occipital lobe” in the brain. The duration of fixation and the saccadic distance and direction (progression/regression) vary based on the complexity of information to be processed (ref: Neuroergonomics: The brain at work by Parasuraman and Matthew, 2008).
Motivation : “Cognition, Linguistic Complexity and Eye-movement are related.”
Annotation and Eye-tracking
Psycholinguistics
• Bicknell and Levy (2010) model eye-movement control in readers: using Bayesian inference on sentence identity, their model predicts how long to fixate on the current position and where to fixate next.
• Demberg and Keller (2008) and Boston et al. (2008) relate eye movement during reading to the underlying syntactic complexity of the text.
• Scott and Sereno (2012) study emotion word processing.
Computational Linguistics
• Kliegl (2011) established a technique to predict word frequency and pattern from eye movements.
• Doherty et al. (2010) introduced eye-tracking as an automatic Machine Translation evaluation technique.
• Stymne et al. (2012) used eye-tracking as a tool for Machine Translation (MT) error analysis, identifying and classifying MT errors.
• Dragsted (2010) observed the coordination of reading and writing processes during translation.
• Joshi et al. (2011) studied the cognitive aspects of the sense annotation process using eye-tracking along with a sense-marking tool.
Literature
PhD THEME
• Using “shallow” cognitive information from eye-tracking data.
• Three different scenarios where eye-tracking data can be used:
PhD Theme
1. Eye-movement data as a form of annotation
   a. Useful where direct manual annotation is unintuitive and prone to subjectivity
   b. Problems addressed: predicting Translation Complexity and Sentiment Annotation Complexity
2. Eye-movement data to find strategies employed by humans for language processing
   a. Find strategies employed by humans to tackle linguistic subtleties while solving specific linguistic tasks
   b. Study done: subjectivity extraction strategies employed by humans during sentiment analysis
3. Modeling eye-movement data
   a. Modeling of eye-movement data for user profiling, user-specific modeling, and finding reading trends
   b. Problems addressed: user-specific topic modeling, generating consensus eye-gaze patterns
PhD Theme – Part 1
Focus: eye-movement data as a form of annotation.
EYE-MOVEMENT DATA AS ANNOTATION
Towards measuring and Predicting Annotation Complexity
Eye-movement data: a "subconscious annotation".
Objective: to use this form of annotation for tasks for which manual/direct annotation is "unintuitive" and "subjective".
Examples:
• assigning fluency/adequacy scores to a translation
• giving readability/translatability scores to paragraphs
Using eye-gaze parameters as subconscious annotation, we propose frameworks to measure and predict:
• Translation Complexity of text
• Sentiment Annotation Complexity of text
Introduction
Translation Complexity Index (TCI): a measure of the inherent complexity of translating a text.
Applications:
• categorizing sentences into different levels of difficulty
• better translation cost modeling in a crowdsourcing/outsourcing scenario
• monitoring the progress of second-language learners
Study 1: Predicting Translation Complexity of Text
Length is not a good indicator of translation difficulty.
Example:
1. The camera-man shot the policeman with a gun. (Length = 8)
2. I was returning from my old office yesterday. (Length = 8)
Sentence 1 is lexically and structurally ambiguous due to the presence of polysemous word “shot” and the prepositional phrase attachment.
Translation Complexity Index: Insight
Framework for Prediction of TCI
[Figure: Prediction framework for TCI. Training data, labeled through the translator's eye-tracking information and represented by linguistic and translation features, is fed to a regressor; test data is then scored with a predicted TCI.]
• Direct manual labeling of training examples is fraught with subjectivity.
• Instead, label each sentence with the time for which translation-related processing is carried out by the brain, the Translation Processing Time (Tp). Tp is the total duration of fixations and saccades over both the source and the target text:

Tp = Σ_{f∈Fs} dur(f) + Σ_{s∈Ss} dur(s) + Σ_{f∈Ft} dur(f) + Σ_{s∈St} dur(s)

where Fs and Ss are the fixations and saccades over the source text, and Ft and St those over the target text.
• TCImeasured = Tp / sentence_length
• TCImeasured is then mapped to a score between 1 and 10 using min-max normalization (the higher the score, the greater the complexity).
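The labeling computation above can be sketched as follows; durations are assumed to be in milliseconds and already extracted from the gaze log, and all names and numbers are illustrative:

```python
def translation_processing_time(src_fix, src_sacc, tgt_fix, tgt_sacc):
    """Tp: total duration of fixations and saccades over source and target."""
    return sum(src_fix) + sum(src_sacc) + sum(tgt_fix) + sum(tgt_sacc)

def measured_tci(tp, sentence_length):
    """Raw TCI label: processing time normalized by sentence length."""
    return tp / sentence_length

def minmax_to_scale(values, lo=1.0, hi=10.0):
    """Map raw TCI values onto a 1-10 scale (higher = more complex)."""
    vmin, vmax = min(values), max(values)
    return [lo + (v - vmin) * (hi - lo) / (vmax - vmin) for v in values]

# Two hypothetical 8-word sentences with their gaze-log durations (ms).
raw = [measured_tci(translation_processing_time([200, 250], [30], [300], [40]), 8),
       measured_tci(translation_processing_time([180], [25], [210, 90], [35]), 8)]
scaled = minmax_to_scale(raw)
```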
Lexical Features
Sentence Length (L)
• Word count
• Intuition: lengthy sentences are more complex to translate
Degree of Polysemy (DP)
• Average senses per word, as per WordNet
• Intuition: the more polysemous a word is, the harder it is to disambiguate its sense
Out-of-vocabulary measure (OOV)
• Percentage of words not present in the General Service List (GSL) and Academic Word List (AWL)
• Intuition: words not present in the working vocabulary of the translator clearly pose challenges to translation
Average syllables per word (SPW)
• Intuition: SPW is an indicator of readability, and readability relates to translatability
Fraction of nouns, verbs and prepositions (PNJ), presence of digits (DIG), named entity count (NEC) - selected by trial and error
Linguistic Features for TCI
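A minimal sketch of the DP computation; the tiny sense inventory below is a hypothetical stand-in for WordNet's synset counts (in practice one would look each word up in WordNet, e.g. via NLTK's `wordnet.synsets`):

```python
# Hypothetical per-word sense counts standing in for WordNet lookups.
SENSE_COUNTS = {"the": 1, "camera-man": 1, "shot": 9, "policeman": 1,
                "with": 1, "a": 1, "gun": 3}

def degree_of_polysemy(tokens):
    """DP: average number of senses per word (unknown words count as 1)."""
    return sum(SENSE_COUNTS.get(t, 1) for t in tokens) / len(tokens)

dp = degree_of_polysemy("the camera-man shot the policeman with a gun".split())
```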
Syntactic Features
Lin's Structural Complexity (SC):
• Mean of the total length of the dependency links appearing in the dependency parse tree of a sentence (Lin, 1996)
• SC for the example sentence = 15/7 = 2.14
• Intuition: the farther apart syntactically linked elements are, the harder it is to parse and comprehend the sentence.
Non-Terminal to Terminal Ratio (NTR):
• The ratio of non-terminals to terminals in the constituency parse tree of a sentence
• Example sentence: "It is possible that this is a tough sentence"
(ROOT (S (NP (PRP It)) (VP (VBZ is) (ADJP (JJ possible)) (SBAR (IN that) (S (NP (DT this)) (VP (VBZ is) (NP (DT a) (JJ tough) (NN sentence))))))))
The non-terminal to terminal ratio is thus 10 / 9 = 1.11.
• Intuition: the ratio is higher for sentences with nested structures, which add to syntactic difficulty and thus translation complexity.
Linguistic Features for TCI (2)
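Assuming the dependency parse is available as a head-index array (a common representation; the parse below is hypothetical, not the slide's example sentence), SC can be computed as:

```python
def structural_complexity(heads):
    """Lin's SC: mean length of dependency links, where a link's length is
    the distance between a token's index and its head's index (root excluded)."""
    lengths = [abs(i - h) for i, h in enumerate(heads) if h >= 0]
    return sum(lengths) / len(lengths)

# Hypothetical parse of an 8-token sentence: heads[i] is token i's head
# index, with -1 marking the root.
heads = [2, 2, -1, 4, 2, 7, 7, 4]
sc = structural_complexity(heads)
```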
Semantic Features
Co-reference Distance (CRD)
• The sum of distances, in number of words, between all pairs of co-referring text segments in a sentence.
• Example: John and Mary live together but she likes cats while he likes dogs. (CRD = 14)
• Intuition: a large portion of text has to be kept in the translator's working memory if CRD is high.
Count of Discourse Connectors (DSC)
• Discourse connectors are linking words or phrases that connect multiple discourses of different semantic content to bring about semantic coherence (e.g. although, however).
• Intuition: since a discourse connector semantically links two discourses, the translator must keep the earlier discourse in active working memory.
Passive Clause Count (PCC)
• The number of clauses in passive voice in a sentence.
• Example: The house is guarded by the dog that is taken care of by the home owner. (PCC = 2)
• Intuition: passive voice was felt to be harder to translate than active voice.
Linguistic Features for TCI(3)
Semantic Features (2)
Height of Hypernymy (HH)
• The path length to the common hypernym (parent node) in WordNet.
• Example: Adaptation and mitigation efforts must therefore go hand in hand. (HH = 4.94)
• Intuition: indicative of the level of abstractness or specificity of a sentence. Did not yield much insight.
Perplexity (PX)
• Perplexity is the degree of uncertainty of N-grams in a sentence.
• A highly perplexing N-gram induces a higher degree of surprise and slows down comprehension.
• For our experiments, we computed trigram perplexity of sentences using language models trained on a mixture of sentences from the Brown corpus, a corpus containing more than one million words.
Linguistic Features for TCI (4)
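A self-contained toy version of the perplexity computation (the actual experiments used trigram models trained on the Brown corpus; here a two-sentence corpus and add-one smoothing stand in, so the absolute values are only illustrative):

```python
import math
from collections import Counter

def train_trigram_counts(sentences):
    """Collect trigram and bigram-context counts over padded sentences."""
    tri, bi, vocab = Counter(), Counter(), set()
    for s in sentences:
        toks = ["<s>", "<s>"] + s.split() + ["</s>"]
        vocab.update(toks)
        for i in range(2, len(toks)):
            tri[tuple(toks[i-2:i+1])] += 1
            bi[tuple(toks[i-2:i])] += 1
    return tri, bi, len(vocab)

def perplexity(sentence, tri, bi, v):
    """PP = exp(-(1/N) * sum log P(w_i | w_{i-2}, w_{i-1})), add-one smoothed."""
    toks = ["<s>", "<s>"] + sentence.split() + ["</s>"]
    logp, n = 0.0, 0
    for i in range(2, len(toks)):
        p = (tri[tuple(toks[i-2:i+1])] + 1) / (bi[tuple(toks[i-2:i])] + v)
        logp += math.log(p)
        n += 1
    return math.exp(-logp / n)

tri, bi, v = train_trigram_counts(["the movie was good", "the movie was bad"])
pp_seen = perplexity("the movie was good", tri, bi, v)
pp_unseen = perplexity("good was movie the", tri, bi, v)
# A sentence matching the training distribution is less perplexing.
```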
Translation Feature: Translation Model Entropy (TME)
• The translation model entropy of a phrase expresses the uncertainty involved in selecting a candidate translation of a source phrase from the set of possible translations. For a source phrase r, with T the set of all possible translations whose translation probabilities sum to 1, the translation entropy H(r) is:

H(r) = − Σ_{t∈T} P(t|r) log P(t|r)

• The TME of a sentence (following Koehn et al.) is computed as:

TME(Sent) = min_{∀R} H(R), where H(R) = H(r1, r2, ..., rn) = Σ_i H(ri)

Here R ranges over the possible phrase segmentations of the sentence, and H(R) is the joint entropy of the components of R, which (by an independence assumption) is the sum of the entropies of the individual phrases.
• Intuitively, if the TME is high, even humans should take more time to decide among the available translation options.
Translation Feature for TCI
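The two definitions above translate directly into code; the phrase table below is hypothetical, standing in for one extracted from a phrase-based MT system:

```python
import math

def phrase_entropy(translation_probs):
    """H(r) = -sum_{t in T} P(t|r) * log P(t|r)."""
    return -sum(p * math.log(p) for p in translation_probs if p > 0)

# Hypothetical phrase table: source phrase -> candidate translation probabilities.
PHRASE_TABLE = {
    "shot": [0.5, 0.3, 0.2],      # ambiguous phrase: high entropy
    "the policeman": [0.9, 0.1],  # near-deterministic: low entropy
}

def segmentation_entropy(segmentation):
    """H(R): sum of component phrase entropies (independence assumption)."""
    return sum(phrase_entropy(PHRASE_TABLE[r]) for r in segmentation)

def tme(segmentations):
    """TME(Sent) = min over candidate phrase segmentations R of H(R)."""
    return min(segmentation_entropy(r) for r in segmentations)
```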
Correlation: Features and Measured TCI
For 80 sentences extracted from the TPR database.
Training data consists of 80 examples obtained from the TPR database (Carl, 2012). We applied Support Vector Regression (Joachims et al., 1999) with different kernels and trade-off parameter C.
Tables 1 and 2 compare the Mean Squared Error (MSE) of the old framework (Table 1, considering only L, DP and SC) and the new framework (Table 2, considering all features).
Experiment and Results
[Figure: Ablation test over the features L, DP, SC, NTR, CRD, DSC, HH, NEC, DIG, OOV, PCC, PN, PV, PJ, SPW, PX and TME; y-axis: Mean Squared Error (0 to 1.6).]
Ablation Test
We observed the correlation between Machine Translation quality estimates (METEOR and TER) and TCI: we obtained Google's translations for the 80 sentences and compared them with the reference translations from the TPR database to compute TER and METEOR scores.
Man vs Machine :: TCI vs MT Quality estimates
TCI is not very well correlated with MT quality estimates, but some individual features are. Can they be used in MT systems to get better output?
Central message: the effort required for a human annotator to detect sentiment is not uniform across texts. Our metric, Sentiment Annotation Complexity (SAC), quantifies this effort using linguistic properties of the text.
Example: "Just what I wanted: a good pizza" versus "Just what I wanted: a cold pizza"
Complexity in sentiment arises at the lexical, syntactic, and semantic/pragmatic levels.
Study 2: Measuring Sentiment Annotation Complexity
Examples:
• It is messy, uncouth, incomprehensible, vicious and absurd.
• A somewhat crudely constructed but gripping, questing look at a person so racked with self-loathing, he becomes an enemy to his own race.
• It's like an all-star salute to Disney's cheesy commercialism.
Framework for Prediction of SAC
[Figure: Prediction framework for SAC, analogous to the TCI framework. Training data, labeled through the annotator's eye-tracking information and represented by linguistic and sentiment features, is fed to a regressor; test data is then scored with a predicted SAC.]
• TCI-like approach.
• The SAC of a piece of text is measured using the "total fixation time", analogously to TCImeasured.
There is no existing database containing both eye-tracking information and sentiment annotation (analogous to TPR for translation), so we created annotated data as follows:
• Five annotators read 566 sentences from a movie review dataset and 493 sentences from a tweet dataset (total: 1059).
• Annotators were asked to assign a polarity (Positive/Negative/Objective) to each text.
• While they annotated, eye-movement data were recorded using an eye-tracker and the Translog-II software.
• SAC is measured using the previously discussed formulation.
Collection of eye-tracking and SA database
Statistics related to the eye-tracking data
Table 1: Corpus statistics
Table 2: Annotators' reading speed (average fixation duration per word)
Figure 1: Distribution of measured and normalized SAC (between 0 and 10) over the 1059 documents; here each document is a sentence
Table 3: Annotation agreement level
Linguistic Features for SAC
Correlation: Feature and measured SAC
Support Vector Regression for predicting SAC using linguistic features.
Prediction framework for SAC
Table 4: Performance of the predictive framework for 5-fold in-domain and cross-domain validation, using Mean Squared Error (MSE), Mean Absolute Error (MAE) and Mean Percentage Error (MPE) estimates, and correlation with the gold labels.
Ablation tests
Error in parsing: movie reviews are sometimes ungrammatical; tweets are mostly ungrammatical. The SC and NTR features are flawed due to incorrect parsing.
Example: "Just saw the new Night at the Museum movie" (the Stanford parser is unable to resolve the PP-attachment ambiguity, making it a verb attachment).
Error in co-reference resolution: anaphora is not resolved appropriately by Stanford CoreNLP in most cases, so the co-reference distance computation is flawed.
Error in sentiment feature extraction: SentiWordNet is used for this purpose; it is automatically annotated and flawed. Calculating sentiment features with SentiWordNet also requires Word Sense Disambiguation (WSD).
Example: "I checked the score and it was 20 love." The word "love" is used here in the sense of "a score of zero in tennis or squash", which is the fifth sense of "love" in WordNet. It is not sentiment bearing (but may be taken as a positive word if the first sense is considered).
Error Analysis
We wanted to check whether the confidence scores of a sentiment classifier are negatively correlated with SAC, implying that what is difficult for humans is also difficult for machines.
• Three P/N/O classifiers were trained using Naïve Bayes (NB), Maximum Entropy (MaxEnt) and Support Vector Classification (SVC) techniques.
• Training data: 10000 random movie reviews from the Amazon corpus and 20000 tweets from a Twitter corpus.
• Features: presence/absence of unigrams, bigrams and trigrams + SAC features.
• Test data: all 566 movie reviews and 493 tweets from the eye-tracking database.
• Classifiers' confidence:
Man vs Machine: SAC vs Classifier Confidence
SAC based on human readings is able to capture the difficulty experienced by classifiers as well.
Goal: to check whether different classifiers can be used to handle text with different levels of SAC. We categorized all the sentences in the training data into three groups: Easy (SAC 0.1-3), Medium (SAC 3.1-7) and Hard (SAC 7.1-10). Accuracies of the classifiers (in %) are given in the table below.
No conclusion could be drawn from this experiment.
SAC and Ensemble Classifiers
So far we have considered gaze fixation/saccade duration as the measure of complexity for SAC and TCI. Fixation duration is still prone to subjectivity and distractions. Can other eye-movement features be exploited? A small experiment incorporated saccadic distance and regression count into the SAC formulation.
A better way of measuring SAC/TCI
Qualitative analysis of modified SAC
# | Sentence | SACold | SACnew
1 | it is messy, uncouth, incomprehensible, vicious and absurd | 3.3 | 3.1
2 | the drama was so uninspiring that even a story immersed in love, lust, and sin couldn't keep my attention. | 3.44 | 1.06
3 | it's like an all-star salute to disney's cheesy commercialism. | 8.3 | 10
For many examples scoring using the modified SAC formula gives better ranking of sentences based on complexity.
Support Vector Regression was applied using the same features and the modified SAC labels.
Results: more erroneous than with the previous labels.
Next steps: error analysis and better feature engineering.
Predicting SAC with modified SAC labels
In this section, we demonstrated how eye-tracking data can be treated as a form of subconscious annotation, and applied the annotation obtained from eye-tracking to build predictive frameworks for the Translation Complexity and Sentiment Annotation Complexity of text.
Summary
PhD Theme – Part 2
Focus: modeling eye-movement data.
EYE-MOVEMENT DATA FOR COGNITIVE MODELING
Text + eye-gaze information = multimodality.
Can this multimodal data be modeled for:
• automatically figuring out the "reading trend" with respect to specific linguistic subtleties?
• user profiling and user-specific modeling to find personal preferences?
Introduction
Eye-tracking studies so far have used gaze features like fixations and saccades. Scanpaths can be more powerful since they comprise fixations and saccades (in terms of gaze progressions and regressions) (von der Malsburg et al., 2012).
Scanpaths as line graphs:
• fixations are nodes
• saccades are edges
Study 3: Synthesizing “signature scanpaths” that represent consensus eye-movement patterns
Vary from task to task
Nature of scanpaths
[Figure: example scanpaths for reading, surfing and viewing]
Vary with goal, person and time
Nature of scanpaths (Reading)
Eye-movement trajectories (scanpaths) for different task-oriented readings. Each row corresponds to the scanpath of a single human subject. Columns (a), (b) and (c) show scanpaths for the tasks of sentiment analysis, summarization and translation respectively.
Sentence: I hate movie trailers that spoil all the movie also. But Dark City is a great movie, I havent seen it in blue ray yet. But I think it has a very good story, kind of dark (as the title says). My type of story.
Goal: to synthesize a common eye-movement representation, or "signature scanpath", from N individual scanpaths corresponding to N readers performing a task-oriented reading.
Utility:
• Represents common behavior; helps explain linguistic phenomena in the light of common perception; reduces noise and subjectivity.
• Can be used to predict the expected eye-movement trajectory during reading (helpful for UI design and computational advertisement).
Signature Scanpaths
Expected output
Figure 3: Combining Scanpaths
1. Cluster fixations to find common Regions of Interest (ROIs) based on spatial information (the position of the word). Each region is assigned an id, ROI_id.
2. Model a Markov chain to get the most likely sequence of ROI_ids. This sequence represents the skeleton of the signature scanpath.
3. Compute the duration of each fixation in the final scanpath using interpolation.
Method
One-dimensional clustering of fixations to find common ROIs; cursor distance is used for the distance matrix computation.
The K-means clustering algorithm is used, with K chosen dynamically according to the length of the sentence under consideration.
The words/phrases in each common ROI are believed to exhibit peculiar characteristics (e.g. elements causing sarcasm).
Clustering to find common region of interest
Figure: Clustering of fixations (features: word positions mapped to fixations)
Objective:

ROI* = argmax_ROI P(ROI | F) ∝ P(F | ROI) × P(ROI)
     = Π_{i=1..N} P(f_i | t_i) × P(t_i | t_{i-1}) .......... (1)
(using independence and Markov assumptions)

where N is the average number of fixations, and ROI and F are the sequences of labels (t_1, t_2, ..., t_N) and fixations (f_1, f_2, ..., f_N) respectively.
Note that the "ROI set" comprises the cluster IDs 1, 2, ..., k.
Applying HMM(1)
Emission probabilities P(f_i | t_i):
P(f_i | t_i) = count(fixations, t_i) / total fixations
Transition probabilities P(t_i | t_{i-1}):
P(t_i | t_{i-1}) = count(transitions from t_{i-1} to t_i) / total transitions from t_{i-1} to other ROIs
Decoding using Viterbi.
Heuristic: time-based segmentation: after a certain time interval T, the transition and emission probabilities are updated.
• To deal with zero probability values, we apply "add-delta" smoothing.
(Flaw: we have not included valuable information such as distinct users visiting the same ROI, or multiple visits to the same ROI.)
Applying HMM(2)
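The estimation-plus-decoding pipeline above can be sketched with a generic first-order Viterbi decoder; the ROI labels, observation alphabet and probabilities below are toy values, not estimates from real gaze data:

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most likely ROI-label sequence for an observation sequence."""
    # V[t][state] = (probability of best path ending in state, that path)
    V = [{s: (start_p[s] * emit_p[s][obs[0]], [s]) for s in states}]
    for o in obs[1:]:
        row = {}
        for s in states:
            prob, path = max(
                (V[-1][prev][0] * trans_p[prev][s] * emit_p[s][o], V[-1][prev][1])
                for prev in states)
            row[s] = (prob, path + [s])
        V.append(row)
    return max(V[-1].values())[1]

# Toy model over two ROIs; in practice the probabilities would be the
# relative-frequency estimates described above (with add-delta smoothing),
# and observations would be fixation events rather than duration labels.
states = ["ROI1", "ROI2"]
start_p = {"ROI1": 0.8, "ROI2": 0.2}
trans_p = {"ROI1": {"ROI1": 0.7, "ROI2": 0.3},
           "ROI2": {"ROI1": 0.4, "ROI2": 0.6}}
emit_p = {"ROI1": {"short": 0.7, "long": 0.3},
          "ROI2": {"short": 0.2, "long": 0.8}}

best = viterbi(["short", "long", "long"], states, start_p, trans_p, emit_p)
```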
Experiment Setup
• Datasets: Dundee readability corpus; IITB Sentiment Analysis paragraph- and sentence-level corpora; TPR database translation/post-editing datasets for English-Hindi and English-Spanish.
• Dataset statistics are given below (Ann represents the number of readers/annotators).
Table: Statistics of the data used
Challenges:
• It is hard to collect gold-standard consensus scanpaths against which the algorithm's output can be compared.
• There is no existing literature reporting such algorithms against which a comparison can be made.
Validation methods: quantitative validation and qualitative validation.
Validation
Validation 1: Fixation-Word Overlap Estimation
We extract all the words underlying the fixated regions in each individual scanpath and in the signature scanpath. For each word in the signature scanpath, we check what fraction of the population fixated on the same word, as expressed by the individual scanpaths.
Intuition: what percentage of the collectively focused regions is reflected in the signature scanpath?
Validation 2: Measuring Variance in Scanpath Distance
We find the distance of the consensus scanpath from each individual scanpath using Scasim (von der Malsburg et al., 2012), a modified edit-distance-based similarity measure that considers both spatial (coordinate) and temporal (duration) properties of fixations. We compute the pairwise distance between the consensus scanpath and each individual scanpath, and then the variance of these distances.
Intuition: a consensus scanpath can be believed to have taken components from every individual scanpath if the variance of the pairwise distances is low.
Qualitative Validation
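Validation 1 reduces to a set computation over fixated words; a sketch, representing each scanpath simply as the list of words it fixates (the scanpaths here are illustrative):

```python
def fixation_word_overlap(signature_words, individual_scanpaths):
    """For each word fixated in the signature scanpath, the fraction of
    readers whose individual scanpath also fixates it; averaged over words."""
    n = len(individual_scanpaths)
    sets = [set(sp) for sp in individual_scanpaths]
    fracs = [sum(w in s for s in sets) / n for w in set(signature_words)]
    return sum(fracs) / len(fracs)

# Three hypothetical readers and a two-word signature scanpath.
individual = [["movie", "great", "dark"],
              ["movie", "dark", "story"],
              ["great", "dark", "story"]]
overlap = fixation_word_overlap(["movie", "dark"], individual)
```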
Quantitative Validation
Table: Validation results for different datasets, shown in terms of word overlap (AOIol) and relative standard deviation (Diststdev)
• The modeling does not consider many factors (e.g. individual vs. collective eye transitions, eye regressions).
• The algorithm has to be compared against a baseline.
• Qualitative evaluation: study whether variation in the signature is caused by the underlying linguistic characteristics.
• Modeling using random-walk theory?
Scope for improvement
Topic model: a set of algorithms that discover latent topics in a dataset based on observed words. Simple topic models: Latent Dirichlet Allocation (LDA) based models.
Recently started working (in collaboration with Aditya) on user profiling and user-specific modeling: incorporating eye-gaze information for term weighting in LDA (experiments ongoing).
Utility of such personalized models:
• recommendation systems
• readers' sentiment analysis
• classroom teaching, where different teaching standards can be adopted based on overlaps in individual topics of interest
Study 4: User specific Topic Models
PhD Theme – Part 3
Focus: eye-movement data to find strategies employed by humans for language processing.
COGNITIVE STUDY OF SUBJECTIVITY EXTRACTION IN SENTIMENT ANNOTATION
Subjectivity extraction: extracting subjective (sentiment-bearing) portions from a text before predicting its sentiment polarity.
We conducted eye-tracking experiments in which humans annotated paragraphs with sentiment labels, covering two kinds of subjective documents:
• Linear sentiment: every subjective sentence follows the same sentiment throughout.
• Oscillating sentiment: the sentiment flips throughout the document.
Two distinct strategies are employed by humans for the task of sentiment analysis:
(a) Subjectivity extraction through anticipation, where sentences following a series of positive or negative polar sentences are skipped and the sentiment is anticipated without reading the whole document.
(b) Subjectivity extraction through homing, where some portions of complex documents (with sentiment flips) are rigorously revisited even after a complete pass of reading.
(A detailed study is being carried out.)
Study 5: Subjectivity extraction during sentiment annotation
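The linear/oscillating distinction can be checked mechanically from the per-sentence polarity sequence of a document's subjective sentences; a document is oscillating if the polarity flips at least once. The helper below is a minimal sketch; the function name and the +1/-1 label encoding are assumptions, not taken from the study.

```python
def sentiment_flow(polarities):
    """Classify a document's subjective-sentence polarity sequence.

    polarities -- list of +1/-1 polarity labels, one per subjective
                  sentence, in reading order
    Returns 'linear' if all subjective sentences share one polarity,
    'oscillating' if the polarity flips at least once.
    """
    # Count adjacent pairs whose polarity differs (sentiment flips).
    flips = sum(1 for a, b in zip(polarities, polarities[1:]) if a != b)
    return "oscillating" if flips > 0 else "linear"
```

Oscillating documents are the ones where the homing strategy (rigorous revisiting) is expected, while linear documents invite anticipation (skipping ahead).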
CONCLUSION AND FUTURE PLANS
Conclusion: work done so far (or ongoing)
• Three different ways to utilize shallow cognitive information from eye-tracking.
• Focused on annotation complexity: its measurement, prediction and application.
• Proposed and implemented a method to extract signature eye-movement patterns to study reading trends.
• Proposed a method to obtain user-specific topic models by term-weighting basic LDA models.
• A cognitive study explaining the "homing" and "anticipation" strategies during sentiment annotation.
Future Work
• Towards generalization of annotation complexity: the study can be applied to other NLP tasks such as WSD; leveraging unlabeled data to address the data-scarcity problem via transductive SVR (experiments ongoing).
• Extraction of "signature scanpaths": better formulation of the synthesis step; proper validation against a baseline.
• User-specific topic modeling: validation of the topics using quantitative and qualitative evaluation.
• Modeling and predicting 'pause' placement in social-media text: does pause affect sentiment? Grounding through cognitive studies; a prediction framework for automatic pause insertion.
THANK YOU