CS626 - Speech, NLP, Web
Harnessing Annotation Process Data for Natural Language Processing: An Investigation Based on Eye-Tracking
Lecture 37
Presented by: Abhijit Mishra (Roll no.: 114056002)
Under the guidance of: Prof. Pushpak Bhattacharyya (PhD advisor) and Prof. Michael Carl (Mentor)
3rd Nov, 2014, Venue: CFILT
Acknowledgements: Aditya J., Nivvedan S.
Background
Eye-movement data as a form of annotation: Translation Complexity and Sentiment Annotation Complexity measurement
Eye-movement data for cognitive modeling: extracting "signature scanpaths" to study trends in linguistic-task-oriented reading; user-specific topic modeling using eye-gaze information
Eye-movement data for cognitive studies in linguistics: study of subjectivity extraction in sentiment annotation
Conclusion and future work
Roadmap
Joshi, Aditya; Mishra, Abhijit; S., Nivvedan; Bhattacharyya, Pushpak. 2014. Measuring Sentiment Annotation Complexity of Text. ACL 2014, Baltimore, USA.
Mishra, Abhijit; Joshi, Aditya; Bhattacharyya, Pushpak. 2014. A Cognitive Study of Subjectivity Extraction in Sentiment Annotation. 5th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis (WASSA), ACL 2014, Baltimore, USA.
Mishra, Abhijit; Singhal, Shubham; Bhattacharyya, Pushpak. 2014. Combining Scanpaths to Identify Common Eye-Movement Strategies in Linguistic-Task-Oriented Reading. Under review.
Mishra, Abhijit; Bhattacharyya, Pushpak; Carl, Michael. 2013. Automatically Predicting Sentence Translation Difficulty. ACL 2013, Sofia, Bulgaria.
Mishra, A.; Carl, M.; Bhattacharyya, P. 2012. A Heuristic-Based Approach for Systematic Error Correction of Gaze Data for Reading. First Workshop on Eye-Tracking and Natural Language Processing, COLING 2012, Mumbai, India.
Additional collaborative work:
Kunchukuttan, Anoop; Mishra, Abhijit; Chatterjee, Rajen; Shah, Ritesh; Bhattacharyya, Pushpak. 2014. Shata-Anuvadak: Tackling Multiway Translation of Indian Languages. LREC 2014, Reykjavik, Iceland, 26-31 May 2014.
Publications
The eye-tracking and sentiment database created at IIT Bombay as part of this work, along with several scripts for data processing and analysis, is released under a Creative Commons License:
http://www.cfilt.iitb.ac.in/~cognitive-nlp/
The predictive frameworks for Sentiment Annotation Complexity and Translation Complexity computation are made available as web services at
http://www.cfilt.iitb.ac.in/TCI and http://www.cfilt.iitb.ac.in/SAC
Released Resources and Tools
BACKGROUND
NLP state of the art: machine learning + linguistics. Supervised/semi-supervised methods are popular and relatively accurate.
The BIG Picture
[Figure: The big picture. Raw data → features and labels → training data → annotation model → prediction. Annotation process data (gaze patterns, keystroke sequences) recorded alongside supplies additional information regarding how humans annotate.]
Annotation Process Data – A byproduct of Annotation
Annotation: the task of labelling text, images or other data with comments, explanations, tags or markups.
Example: The movie was good.
Part-of-speech annotation: The/DT movie/NN was/VBD good/JJ ./.
Translation (to Hindi): फिल्म अच्छी थी.
Sentiment annotation: Positive
Annotation involves visualization, comprehension (understanding) and production (producing annotations), requiring "reading" and "typing".
Annotation process data: an organized representation of such activities.
Data representing reading and writing activities.
Gaze data:
• Gaze points: position of the eye-gaze on the screen
• Fixations: a long stay of the gaze on a particular object on the screen. Fixations have both spatial (coordinates) and temporal (duration) properties.
• Saccade: a very rapid movement of the eye between positions of rest
• Scanpath: a path connecting a series of fixations
• Regression: revisiting a previously read segment
Keystrokes:
• Insertion: inserting a word or character inside running text
• Deletion: deleting a word or character
• Selection: highlighting
• Rearrangement: dragging or copy/pasting highlighted text
Other types of data: EEG signals, speech, etc.
Annotation Process Data
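These gaze-data concepts can be made concrete with a small data-structure sketch; the field names and the left-to-right regression heuristic are illustrative, not tied to any particular eye-tracker's output format:

```python
from dataclasses import dataclass

@dataclass
class Fixation:
    x: float        # spatial property: screen coordinates
    y: float
    duration: int   # temporal property: milliseconds

# A scanpath is simply the sequence of fixations; the saccades are the
# implicit rapid movements between consecutive fixations.
scanpath = [Fixation(102, 48, 230), Fixation(178, 50, 180), Fixation(95, 49, 210)]

def is_regression(prev: Fixation, curr: Fixation) -> bool:
    """Crude heuristic for left-to-right reading: a saccade landing to the
    left of (or above) the previous fixation revisits earlier text."""
    return curr.x < prev.x or curr.y < prev.y

regressions = sum(is_regression(a, b) for a, b in zip(scanpath, scanpath[1:]))
```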
Human eye movement is poised between perception and cognition. The eye-movement pattern during goal-oriented reading is driven by:
• perceptual properties of the text
• cognitive processes underlying language processing
Eye-movement is controlled by the “occipital lobe” in the brain. The duration of fixation and the saccadic distance and direction (progression/regression) vary based on the complexity of information to be processed (ref: Neuroergonomics: The brain at work by Parasuraman and Matthew, 2008).
Motivation : “Cognition, Linguistic Complexity and Eye-movement are related.”
Annotation and Eye-tracking
Psycholinguistics
• Bicknell and Levy (2010) model eye-movement control in readers: using Bayesian inference on sentence identity, their model predicts how long to fixate on the current position and where to fixate next.
• Demberg and Keller (2008) and Boston et al. (2008) relate eye movement during reading to the underlying syntactic complexity of the text.
• Scott and Sereno (2012) study emotion word processing.
Computational Linguistics
• Kliegl (2011) established a technique to predict word frequency and pattern from eye movements.
• Doherty et al. (2010) introduced eye-tracking as an automatic Machine Translation evaluation technique.
• Stymne et al. (2012) used eye-tracking as a tool for Machine Translation (MT) error analysis, identifying and classifying MT errors.
• Dragsted (2010) observed the coordination of reading and writing processes during translation.
• Joshi et al. (2011) studied the cognitive aspects of the sense annotation process using eye-tracking along with a sense-marking tool.
Literature
PhD THEME
• Using “shallow” cognitive information from eye-tracking data.
• Three different scenarios where eye-tracking data can be used:
PhD Theme
1. Eye-movement data as a form of annotation
   a. Useful where direct manual annotation is unintuitive and prone to subjectivity
   b. Problems addressed: predicting Translation Complexity and Sentiment Annotation Complexity
2. Eye-movement data to find strategies employed by humans for language processing
   a. Find strategies employed by humans to tackle linguistic subtleties while solving specific linguistic tasks
   b. Study done: subjectivity extraction strategies employed by humans during sentiment analysis
3. Modeling eye-movement data
   a. Modeling of eye-movement data for user profiling, user-specific modeling, and finding reading trends
   b. Problems addressed: user-specific topic modeling, generating consensus eye-gaze patterns
PhD Theme – Part 1
Focus: eye-movement data as a form of annotation.
EYE-MOVEMENT DATA AS ANNOTATION
Towards measuring and Predicting Annotation Complexity
Eye-movement data: a "subconscious annotation".
Objective: to use this form of annotation for tasks for which manual/direct annotation is "unintuitive" and "subjective".
Examples:
• assigning fluency/adequacy scores to a translation
• giving readability/translatability scores to paragraphs
Using eye-gaze parameters as subconscious annotation, we propose frameworks to measure and predict:
• Translation Complexity of text
• Sentiment Annotation Complexity of text
Introduction
Translation Complexity Index (TCI): a measure of the inherent complexity of translating a text.
Applications:
• categorizing sentences into different levels of difficulty
• better translation cost modeling in a crowdsourcing/outsourcing scenario
• monitoring the progress of second-language learners
Study 1: Predicting Translation Complexity of Text
Length is not a good indicator of translation difficulty.
Example:
1. The camera-man shot the policeman with a gun. (Length = 8)
2. I was returning from my old office yesterday. (Length = 8)
Sentence 1 is lexically and structurally ambiguous due to the presence of polysemous word “shot” and the prepositional phrase attachment.
Translation Complexity Index: Insight
Framework for Prediction of TCI
[Figure: Prediction framework for TCI. Training data, labeled through the translator's eye-tracking information and represented by linguistic and translation features, is fed to a regressor; test data is then scored with a predicted TCI.]
• Direct manual labeling of training examples is fraught with subjectivity.
• Instead, label each sentence with the time for which translation-related processing is carried out by the brain, the Translation Processing Time (Tp). Tp is the total duration of fixations and saccades over both the source and the target text:

Tp = Σ_{f∈Fs} dur(f) + Σ_{s∈Ss} dur(s) + Σ_{f∈Ft} dur(f) + Σ_{s∈St} dur(s)

where Fs and Ss are the fixations and saccades over the source text, and Ft and St those over the target text.
• TCImeasured = Tp / sentence_length
• TCImeasured is then mapped to a score between 1 and 10 using min-max normalization (the higher the score, the greater the complexity).
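The labeling computation above can be sketched as follows; durations are assumed to be in milliseconds and already extracted from the gaze log, and all names and numbers are illustrative:

```python
def translation_processing_time(src_fix, src_sacc, tgt_fix, tgt_sacc):
    """Tp: total duration of fixations and saccades over source and target."""
    return sum(src_fix) + sum(src_sacc) + sum(tgt_fix) + sum(tgt_sacc)

def measured_tci(tp, sentence_length):
    """Raw TCI label: processing time normalized by sentence length."""
    return tp / sentence_length

def minmax_to_scale(values, lo=1.0, hi=10.0):
    """Map raw TCI values onto a 1-10 scale (higher = more complex)."""
    vmin, vmax = min(values), max(values)
    return [lo + (v - vmin) * (hi - lo) / (vmax - vmin) for v in values]

# Two hypothetical 8-word sentences with their gaze-log durations (ms).
raw = [measured_tci(translation_processing_time([200, 250], [30], [300], [40]), 8),
       measured_tci(translation_processing_time([180], [25], [210, 90], [35]), 8)]
scaled = minmax_to_scale(raw)
```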
Lexical Features
Sentence Length (L)
• Word count
• Intuition: lengthy sentences are more complex to translate
Degree of Polysemy (DP)
• Average senses per word, as per WordNet
• Intuition: the more polysemous a word is, the harder it is to disambiguate its sense
Out-of-vocabulary measure (OOV)
• Percentage of words not present in the General Service List (GSL) and Academic Word List (AWL)
• Intuition: words not present in the working vocabulary of the translator clearly pose challenges to translation
Average syllables per word (SPW)
• Intuition: SPW is an indicator of readability, and readability relates to translatability
Fraction of nouns, verbs and prepositions (PNJ), presence of digits (DIG), named entity count (NEC) - selected by trial and error
Linguistic Features for TCI
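A minimal sketch of the DP computation; the tiny sense inventory below is a hypothetical stand-in for WordNet's synset counts (in practice one would look each word up in WordNet, e.g. via NLTK's `wordnet.synsets`):

```python
# Hypothetical per-word sense counts standing in for WordNet lookups.
SENSE_COUNTS = {"the": 1, "camera-man": 1, "shot": 9, "policeman": 1,
                "with": 1, "a": 1, "gun": 3}

def degree_of_polysemy(tokens):
    """DP: average number of senses per word (unknown words count as 1)."""
    return sum(SENSE_COUNTS.get(t, 1) for t in tokens) / len(tokens)

dp = degree_of_polysemy("the camera-man shot the policeman with a gun".split())
```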
Syntactic Features
Lin's Structural Complexity (SC):
• Mean of the total length of the dependency links appearing in the dependency parse tree of a sentence (Lin, 1996)
• SC for the example sentence = 15/7 = 2.14
• Intuition: the farther apart syntactically linked elements are, the harder it is to parse and comprehend the sentence.
Non-Terminal to Terminal Ratio (NTR):
• The ratio of non-terminals to terminals in the constituency parse tree of a sentence
• Example sentence: "It is possible that this is a tough sentence"
(ROOT (S (NP (PRP It)) (VP (VBZ is) (ADJP (JJ possible)) (SBAR (IN that) (S (NP (DT this)) (VP (VBZ is) (NP (DT a) (JJ tough) (NN sentence))))))))
The non-terminal to terminal ratio is thus 10 / 9 = 1.11.
• Intuition: the ratio is higher for sentences with nested structures, which add to syntactic difficulty and thus translation complexity.
Linguistic Features for TCI (2)
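Assuming the dependency parse is available as a head-index array (a common representation; the parse below is hypothetical, not the slide's example sentence), SC can be computed as:

```python
def structural_complexity(heads):
    """Lin's SC: mean length of dependency links, where a link's length is
    the distance between a token's index and its head's index (root excluded)."""
    lengths = [abs(i - h) for i, h in enumerate(heads) if h >= 0]
    return sum(lengths) / len(lengths)

# Hypothetical parse of an 8-token sentence: heads[i] is token i's head
# index, with -1 marking the root.
heads = [2, 2, -1, 4, 2, 7, 7, 4]
sc = structural_complexity(heads)
```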
Semantic Features
Co-reference Distance (CRD)
• The sum of distances, in number of words, between all pairs of co-referring text segments in a sentence.
• Example: John and Mary live together but she likes cats while he likes dogs. (CRD = 14)
• Intuition: a large portion of text has to be kept in the translator's working memory if CRD is high.
Count of Discourse Connectors (DSC)
• Discourse connectors are linking words or phrases that connect multiple discourses of different semantic content to bring about semantic coherence (e.g. although, however).
• Intuition: since a discourse connector semantically links two discourses, the translator must keep the earlier discourse in active working memory.
Passive Clause Count (PCC)
• The number of clauses in passive voice in a sentence.
• Example: The house is guarded by the dog that is taken care of by the home owner. (PCC = 2)
• Intuition: passive voice was felt to be harder to translate than active voice.
Linguistic Features for TCI(3)
Semantic Features (2)
Height of Hypernymy (HH)
• The path length to the common hypernym (parent node) in WordNet.
• Example: Adaptation and mitigation efforts must therefore go hand in hand. (HH = 4.94)
• Intuition: indicative of the level of abstractness or specificity of a sentence. Did not yield much insight.
Perplexity (PX)
• Perplexity is the degree of uncertainty of N-grams in a sentence.
• A highly perplexing N-gram induces a higher degree of surprise and slows down comprehension.
• For our experiments, we computed trigram perplexity of sentences using language models trained on a mixture of sentences from the Brown corpus, a corpus containing more than one million words.
Linguistic Features for TCI (4)
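A self-contained toy version of the perplexity computation (the actual experiments used trigram models trained on the Brown corpus; here a two-sentence corpus and add-one smoothing stand in, so the absolute values are only illustrative):

```python
import math
from collections import Counter

def train_trigram_counts(sentences):
    """Collect trigram and bigram-context counts over padded sentences."""
    tri, bi, vocab = Counter(), Counter(), set()
    for s in sentences:
        toks = ["<s>", "<s>"] + s.split() + ["</s>"]
        vocab.update(toks)
        for i in range(2, len(toks)):
            tri[tuple(toks[i-2:i+1])] += 1
            bi[tuple(toks[i-2:i])] += 1
    return tri, bi, len(vocab)

def perplexity(sentence, tri, bi, v):
    """PP = exp(-(1/N) * sum log P(w_i | w_{i-2}, w_{i-1})), add-one smoothed."""
    toks = ["<s>", "<s>"] + sentence.split() + ["</s>"]
    logp, n = 0.0, 0
    for i in range(2, len(toks)):
        p = (tri[tuple(toks[i-2:i+1])] + 1) / (bi[tuple(toks[i-2:i])] + v)
        logp += math.log(p)
        n += 1
    return math.exp(-logp / n)

tri, bi, v = train_trigram_counts(["the movie was good", "the movie was bad"])
pp_seen = perplexity("the movie was good", tri, bi, v)
pp_unseen = perplexity("good was movie the", tri, bi, v)
# A sentence matching the training distribution is less perplexing.
```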
Translation Feature: Translation Model Entropy (TME)
• The translation model entropy of a phrase expresses the uncertainty involved in selecting a candidate translation of a source phrase from the set of possible translations. For a source phrase r, with T the set of all possible translations whose translation probabilities sum to 1, the translation entropy H(r) is:

H(r) = − Σ_{t∈T} P(t|r) log P(t|r)

• The TME of a sentence (following Koehn et al.) is computed as:

TME(Sent) = min_{∀R} H(R), where H(R) = H(r1, r2, ..., rn) = Σ_i H(ri)

Here R ranges over the possible phrase segmentations of the sentence, and H(R) is the joint entropy of the components of R, which (by an independence assumption) is the sum of the entropies of the individual phrases.
• Intuitively, if the TME is high, even humans should take more time to decide among the available translation options.
Translation Feature for TCI
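The two definitions above translate directly into code; the phrase table below is hypothetical, standing in for one extracted from a phrase-based MT system:

```python
import math

def phrase_entropy(translation_probs):
    """H(r) = -sum_{t in T} P(t|r) * log P(t|r)."""
    return -sum(p * math.log(p) for p in translation_probs if p > 0)

# Hypothetical phrase table: source phrase -> candidate translation probabilities.
PHRASE_TABLE = {
    "shot": [0.5, 0.3, 0.2],      # ambiguous phrase: high entropy
    "the policeman": [0.9, 0.1],  # near-deterministic: low entropy
}

def segmentation_entropy(segmentation):
    """H(R): sum of component phrase entropies (independence assumption)."""
    return sum(phrase_entropy(PHRASE_TABLE[r]) for r in segmentation)

def tme(segmentations):
    """TME(Sent) = min over candidate phrase segmentations R of H(R)."""
    return min(segmentation_entropy(r) for r in segmentations)
```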
Correlation: Features and Measured TCI
For 80 sentences extracted from the TPR database.
Training data consists of 80 examples obtained from the TPR database (Carl, 2012). We applied Support Vector Regression (Joachims et al., 1999) with different kernels and trade-off parameter C.
Tables 1 and 2 compare the Mean Squared Error (MSE) of the old framework (Table 1, considering only L, DP and SC) and the new framework (Table 2, considering all features).
Experiment and Results
[Figure: Ablation test over the features L, DP, SC, NTR, CRD, DSC, HH, NEC, DIG, OOV, PCC, PN, PV, PJ, SPW, PX and TME; y-axis: Mean Squared Error (0 to 1.6).]
Ablation Test
We observed the correlation between Machine Translation quality estimates (METEOR and TER) and TCI: we obtained Google's translations for the 80 sentences and compared them with the reference translations from the TPR database to compute TER and METEOR scores.
Man vs Machine :: TCI vs MT Quality estimates
TCI is not very well correlated with MT quality estimates, but some individual features are. Can they be used in MT systems to get better output?
Central message: the effort required for a human annotator to detect sentiment is not uniform across texts. Our metric, Sentiment Annotation Complexity (SAC), quantifies this effort using linguistic properties of the text.
Example: "Just what I wanted: a good pizza" versus "Just what I wanted: a cold pizza"
Complexity in sentiment arises at the lexical, syntactic, and semantic/pragmatic levels.
Study 2: Measuring Sentiment Annotation Complexity
Examples:
• It is messy, uncouth, incomprehensible, vicious and absurd.
• A somewhat crudely constructed but gripping, questing look at a person so racked with self-loathing, he becomes an enemy to his own race.
• It's like an all-star salute to Disney's cheesy commercialism.
Framework for Prediction of SAC
[Figure: Prediction framework for SAC, analogous to the TCI framework. Training data, labeled through the annotator's eye-tracking information and represented by linguistic and sentiment features, is fed to a regressor; test data is then scored with a predicted SAC.]
• TCI-like approach.
• The SAC of a piece of text is measured using the "total fixation time", analogously to TCImeasured.
There is no existing database containing both eye-tracking information and sentiment annotation (analogous to TPR for translation), so we created annotated data as follows:
• Five annotators read 566 sentences from a movie review dataset and 493 sentences from a tweet dataset (total: 1059).
• Annotators were asked to assign a polarity (Positive/Negative/Objective) to each text.
• While they annotated, eye-movement data were recorded using an eye-tracker and the Translog-II software.
• SAC is measured using the previously discussed formulation.
Collection of eye-tracking and SA database
Statistics related to the eye-tracking data
Table 1: Corpus statistics
Table 2: Annotators' reading speed (average fixation duration per word)
Figure 1: Distribution of measured and normalized SAC (between 0 and 10) over the 1059 documents; here each document is a sentence
Table 3: Annotation agreement level
Linguistic Features for SAC
Correlation: Feature and measured SAC
Support Vector Regression for predicting SAC using linguistic features.
Prediction framework for SAC
Table 4: Performance of the predictive framework for 5-fold in-domain and cross-domain validation, using Mean Squared Error (MSE), Mean Absolute Error (MAE) and Mean Percentage Error (MPE) estimates, and correlation with the gold labels.
Ablation tests
Error in parsing: movie reviews are sometimes ungrammatical; tweets are mostly ungrammatical. The SC and NTR features are flawed due to incorrect parsing.
Example: "Just saw the new Night at the Museum movie" (the Stanford parser is unable to resolve the PP-attachment ambiguity, making it a verb attachment).
Error in co-reference resolution: anaphora is not resolved appropriately by Stanford CoreNLP in most cases, so the co-reference distance computation is flawed.
Error in sentiment feature extraction: SentiWordNet is used for this purpose; it is automatically annotated and flawed. Calculating sentiment features with SentiWordNet also requires Word Sense Disambiguation (WSD).
Example: "I checked the score and it was 20 love." The word "love" is used here in the sense of "a score of zero in tennis or squash", which is the fifth sense of "love" in WordNet. It is not sentiment bearing (but may be taken as a positive word if the first sense is considered).
Error Analysis
We wanted to check whether the confidence scores of a sentiment classifier are negatively correlated with SAC, implying that what is difficult for humans is also difficult for machines.
• Three P/N/O classifiers were trained using Naïve Bayes (NB), Maximum Entropy (MaxEnt) and Support Vector Classification (SVC) techniques.
• Training data: 10000 random movie reviews from the Amazon corpus and 20000 tweets from a Twitter corpus.
• Features: presence/absence of unigrams, bigrams and trigrams + SAC features.
• Test data: all 566 movie reviews and 493 tweets from the eye-tracking database.
• Classifiers' confidence:
Man vs Machine: SAC vs Classifier Confidence
SAC based on human readings is able to capture the difficulty experienced by classifiers as well.
Goal: to check whether different classifiers can be used to handle text with different levels of SAC. We categorized all the sentences in the training data into three groups: Easy (SAC 0.1-3), Medium (SAC 3.1-7) and Hard (SAC 7.1-10). Accuracies of the classifiers (in %) are given in the table below.
No conclusion could be drawn from this experiment.
SAC and Ensemble Classifiers
So far we have considered gaze fixation/saccade duration as the measure of complexity for SAC and TCI. Fixation duration is still prone to subjectivity and distractions. Can other eye-movement features be exploited? A small experiment incorporated saccadic distance and regression count into the SAC formulation.
A better way of measuring SAC/TCI
Qualitative analysis of modified SAC
# | Sentence | SACold | SACnew
1 | it is messy, uncouth, incomprehensible, vicious and absurd | 3.3 | 3.1
2 | the drama was so uninspiring that even a story immersed in love, lust, and sin couldn't keep my attention. | 3.44 | 1.06
3 | it's like an all-star salute to disney's cheesy commercialism. | 8.3 | 10
For many examples scoring using the modified SAC formula gives better ranking of sentences based on complexity.
Support Vector Regression was applied using the same features and the modified SAC labels.
Results: more erroneous than with the previous labels.
Next steps: error analysis and better feature engineering.
Predicting SAC with modified SAC labels
In this section, we demonstrated how eye-tracking data can be treated as a form of subconscious annotation, and applied the annotation obtained from eye-tracking to build predictive frameworks for the Translation Complexity and Sentiment Annotation Complexity of text.
Summary
PhD Theme – Part 2
Focus: modeling eye-movement data.
EYE-MOVEMENT DATA FOR COGNITIVE MODELING
Text + eye-gaze information = multimodality.
Can this multimodal data be modeled for:
• automatically figuring out the "reading trend" with respect to specific linguistic subtleties?
• user profiling and user-specific modeling to find personal preferences?
Introduction
Eye-tracking studies so far have used gaze features like fixations and saccades. Scanpaths can be more powerful since they comprise fixations and saccades (in terms of gaze progressions and regressions) (von der Malsburg et al., 2012).
Scanpaths as line graphs:
• fixations are nodes
• saccades are edges
Study 3: Synthesizing “signature scanpaths” that represent consensus eye-movement patterns
Vary from task to task
Nature of scanpaths
[Figure: example scanpaths for reading, surfing and viewing]
Vary with goal, person and time
Nature of scanpaths (Reading)
Eye-movement trajectories (scanpaths) for different task-oriented readings. Each row corresponds to the scanpath of a single human subject. Columns (a), (b) and (c) show scanpaths for the tasks of sentiment analysis, summarization and translation respectively.
Sentence: I hate movie trailers that spoil all the movie also. But Dark City is a great movie, I havent seen it in blue ray yet. But I think it has a very good story, kind of dark (as the title says). My type of story.
Goal: to synthesize a common eye-movement representation, or "signature scanpath", from N individual scanpaths corresponding to N readers performing a task-oriented reading.
Utility:
• Represents common behavior; helps explain linguistic phenomena in the light of common perception; reduces noise and subjectivity.
• Can be used to predict the expected eye-movement trajectory during reading (helpful for UI design and computational advertisement).
Signature Scanpaths
Expected output
Figure 3: Combining Scanpaths
1. Cluster fixations to find common Regions of Interest (ROIs) based on spatial information (the position of the word). Each region is assigned an id, ROI_id.
2. Model a Markov chain to get the most likely sequence of ROI_ids. This sequence represents the skeleton of the signature scanpath.
3. Compute the duration of each fixation in the final scanpath using interpolation.
Method
One-dimensional clustering of fixations to find common ROIs; cursor distance is used for the distance matrix computation.
The K-means clustering algorithm is used, with K chosen dynamically according to the length of the sentence under consideration.
The words/phrases in each common ROI are believed to exhibit peculiar characteristics (e.g. elements causing sarcasm).
Clustering to find common region of interest
Figure: Clustering of fixations (features: word positions mapped to fixations)
Objective:

ROI* = argmax_ROI P(ROI | F) ∝ P(F | ROI) × P(ROI)
     = Π_{i=1..N} P(f_i | t_i) × P(t_i | t_{i-1}) .......... (1)
(using independence and Markov assumptions)

where N is the average number of fixations, and ROI and F are the sequences of labels (t_1, t_2, ..., t_N) and fixations (f_1, f_2, ..., f_N) respectively.
Note that the "ROI set" comprises the cluster IDs 1, 2, ..., k.
Applying HMM(1)
Emission probabilities P(f_i | t_i):
P(f_i | t_i) = count(fixations, t_i) / total fixations
Transition probabilities P(t_i | t_{i-1}):
P(t_i | t_{i-1}) = count(transitions from t_{i-1} to t_i) / total transitions from t_{i-1} to other ROIs
Decoding using Viterbi.
Heuristic: time-based segmentation: after a certain time interval T, the transition and emission probabilities are updated.
• To deal with zero probability values, we apply "add-delta" smoothing.
(Flaw: we have not included valuable information such as distinct users visiting the same ROI, or multiple visits to the same ROI.)
Applying HMM(2)
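The estimation-plus-decoding pipeline above can be sketched with a generic first-order Viterbi decoder; the ROI labels, observation alphabet and probabilities below are toy values, not estimates from real gaze data:

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most likely ROI-label sequence for an observation sequence."""
    # V[t][state] = (probability of best path ending in state, that path)
    V = [{s: (start_p[s] * emit_p[s][obs[0]], [s]) for s in states}]
    for o in obs[1:]:
        row = {}
        for s in states:
            prob, path = max(
                (V[-1][prev][0] * trans_p[prev][s] * emit_p[s][o], V[-1][prev][1])
                for prev in states)
            row[s] = (prob, path + [s])
        V.append(row)
    return max(V[-1].values())[1]

# Toy model over two ROIs; in practice the probabilities would be the
# relative-frequency estimates described above (with add-delta smoothing),
# and observations would be fixation events rather than duration labels.
states = ["ROI1", "ROI2"]
start_p = {"ROI1": 0.8, "ROI2": 0.2}
trans_p = {"ROI1": {"ROI1": 0.7, "ROI2": 0.3},
           "ROI2": {"ROI1": 0.4, "ROI2": 0.6}}
emit_p = {"ROI1": {"short": 0.7, "long": 0.3},
          "ROI2": {"short": 0.2, "long": 0.8}}

best = viterbi(["short", "long", "long"], states, start_p, trans_p, emit_p)
```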
Experiment Setup
• Datasets: Dundee readability corpus; IITB Sentiment Analysis paragraph- and sentence-level corpora; TPR database translation/post-editing datasets for English-Hindi and English-Spanish.
• Dataset statistics are given below (Ann represents the number of readers/annotators).
Table: Statistics of the data used
Challenges:
• It is hard to collect gold-standard consensus scanpaths against which the algorithm's output can be compared.
• There is no existing literature reporting such algorithms against which a comparison can be made.
Validation methods: quantitative validation and qualitative validation.
Validation
Validation 1: Fixation-Word Overlap Estimation
We extract all the words underlying the fixated regions in each individual scanpath and in the signature scanpath. For each word in the signature scanpath, we check what fraction of the population fixated on the same word, as expressed by the individual scanpaths.
Intuition: what percentage of the collectively focused regions is reflected in the signature scanpath?
Validation 2: Measuring Variance in Scanpath Distance
We find the distance of the consensus scanpath from each individual scanpath using Scasim (von der Malsburg et al., 2012), a modified edit-distance-based similarity measure that considers both spatial (coordinate) and temporal (duration) properties of fixations. We compute the pairwise distance between the consensus scanpath and each individual scanpath, and then the variance of these distances.
Intuition: a consensus scanpath can be believed to have taken components from every individual scanpath if the variance of the pairwise distances is low.
Qualitative Validation
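Validation 1 reduces to a set computation over fixated words; a sketch, representing each scanpath simply as the list of words it fixates (the scanpaths here are illustrative):

```python
def fixation_word_overlap(signature_words, individual_scanpaths):
    """For each word fixated in the signature scanpath, the fraction of
    readers whose individual scanpath also fixates it; averaged over words."""
    n = len(individual_scanpaths)
    sets = [set(sp) for sp in individual_scanpaths]
    fracs = [sum(w in s for s in sets) / n for w in set(signature_words)]
    return sum(fracs) / len(fracs)

# Three hypothetical readers and a two-word signature scanpath.
individual = [["movie", "great", "dark"],
              ["movie", "dark", "story"],
              ["great", "dark", "story"]]
overlap = fixation_word_overlap(["movie", "dark"], individual)
```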
Quantitative Validation
Table: Validation results for different datasets, shown in terms of word overlap (AOIol) and relative standard deviation (Diststdev)
• The modeling does not consider many factors (e.g. individual vs. collective eye transitions, eye regressions).
• The algorithm has to be compared against a baseline.
• Qualitative evaluation: study whether variation in the signature is caused by the underlying linguistic characteristics.
• Modeling using random-walk theory?
Scope for improvement
Topic model: a set of algorithms that discover latent topics in a dataset based on observed words. Simple topic models: Latent Dirichlet Allocation (LDA) based models.
Recently started working (in collaboration with Aditya) on user profiling and user-specific modeling: incorporating eye-gaze information for term weighting in LDA (experiments ongoing).
Utility of such personalized models:
• recommendation systems
• readers' sentiment analysis
• classroom teaching, where different teaching standards can be adopted based on overlaps in individual topics of interest
Study 4: User specific Topic Models
PhD Theme – Part 3
Focus: eye-movement data to find strategies employed by humans for language processing.
COGNITIVE STUDY OF SUBJECTIVITY EXTRACTION IN SENTIMENT ANNOTATION
Subjectivity extraction: extracting subjective (sentiment-bearing) portions from a text before predicting its sentiment polarity.
We conducted eye-tracking experiments in which humans annotated paragraphs with sentiment labels, covering two kinds of subjective documents:
• Linear sentiment: every subjective sentence follows the same sentiment throughout.
• Oscillating sentiment: the sentiment flips throughout the document.
Two distinct strategies are employed by humans for the task of sentiment analysis:
(a) Subjectivity extraction through anticipation, where sentences following a series of positive or negative polar sentences are skipped and the sentiment is anticipated without reading the whole document.
(b) Subjectivity extraction through homing, where some portions of complex documents (with sentiment flips) are rigorously revisited even after a complete pass of reading.
(A detailed study is being carried out.)
Study 5: Subjectivity extraction during sentiment annotation
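The linear/oscillating distinction can be checked mechanically from the per-sentence polarity sequence of a document's subjective sentences; a document is oscillating if the polarity flips at least once. The helper below is a minimal sketch; the function name and the +1/-1 label encoding are assumptions, not taken from the study.

```python
def sentiment_flow(polarities):
    """Classify a document's subjective-sentence polarity sequence.

    polarities -- list of +1/-1 polarity labels, one per subjective
                  sentence, in reading order
    Returns 'linear' if all subjective sentences share one polarity,
    'oscillating' if the polarity flips at least once.
    """
    # Count adjacent pairs whose polarity differs (sentiment flips).
    flips = sum(1 for a, b in zip(polarities, polarities[1:]) if a != b)
    return "oscillating" if flips > 0 else "linear"
```

Oscillating documents are the ones where the homing strategy (rigorous revisiting) is expected, while linear documents invite anticipation (skipping ahead).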
CONCLUSION AND FUTURE PLANS
Conclusion: work done so far (or ongoing)
• Three different ways to utilize shallow cognitive information from eye-tracking.
• Focused on annotation complexity: its measurement, prediction and application.
• Proposed and implemented a method to extract signature eye-movement patterns to study reading trends.
• Proposed a method to obtain user-specific topic models by term-weighting basic LDA models.
• A cognitive study explaining the "homing" and "anticipation" strategies during sentiment annotation.
Future Work
• Towards generalization of annotation complexity: the study can be applied to other NLP tasks such as WSD; leveraging unlabeled data to address the data-scarcity problem via transductive SVR (experiments ongoing).
• Extraction of "signature scanpaths": better formulation of the synthesis step; proper validation against a baseline.
• User-specific topic modeling: validation of the topics using quantitative and qualitative evaluation.
• Modeling and predicting 'pause' placement in social-media text: does pause affect sentiment? Grounding through cognitive studies; a prediction framework for automatic pause insertion.
THANK YOU