CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)
![Page 1: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/1.jpg)
CS460/626 : Natural Language Processing/Speech, NLP and the Web
(Lecture 1 – Introduction)
Pushpak Bhattacharyya
CSE Dept., IIT Bombay
4th Jan, 2011
![Page 2: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/2.jpg)
Persons involved
Faculty instructor: Dr. Pushpak Bhattacharyya (www.cse.iitb.ac.in/~pb)
TAs: Joydip Datta, Debarghya Majumdar {joydip,deb}@cse
Course home page (to be created): www.cse.iitb.ac.in/~cs626-460-2011
![Page 3: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/3.jpg)
Perspectivising NLP: Areas of AI and their inter-dependencies
Search, Vision, Planning, Machine Learning, Knowledge Representation, Logic, Expert Systems, Robotics, NLP
![Page 4: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/4.jpg)
Books etc.
Main Text(s):
  Natural Language Understanding: James Allen
  Speech and NLP: Jurafsky and Martin
  Foundations of Statistical NLP: Manning and Schütze
Other References:
  NLP, a Paninian Perspective: Bharati, Chaitanya and Sangal
  Statistical NLP: Charniak
Journals: Computational Linguistics, Natural Language Engineering, AI, AI Magazine, IEEE SMC
Conferences: ACL, EACL, COLING, MT Summit, EMNLP, IJCNLP, HLT, ICON, SIGIR, WWW, ICML, ECML
![Page 5: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/5.jpg)
Allied Disciplines
Philosophy: Semantics, Meaning of “meaning”, Logic (syllogism)
Linguistics: Study of Syntax, Lexicon, Lexical Semantics etc.
Probability and Statistics: Corpus Linguistics, Testing of Hypotheses, System Evaluation
Cognitive Science: Computational Models of Language Processing, Language Acquisition
Psychology: Behavioristic insights into Language Processing, Psychological Models
Brain Science: Language Processing Areas in Brain
Physics: Information Theory, Entropy, Random Fields
Computer Sc. & Engg.: Systems for NLP
![Page 6: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/6.jpg)
Topics proposed to be covered
Shallow Processing
  Part of Speech Tagging and Chunking using HMM, MEMM, CRF, and Rule Based Systems
  EM Algorithm
Language Modeling
  N-grams
  Probabilistic CFGs
Basic Speech Processing
  Phonology and Phonetics
  Statistical Approach
  Automatic Speech Recognition and Speech Synthesis
Deep Parsing
  Classical Approaches: Top-Down, Bottom-Up and Hybrid Methods
  Chart Parsing, Earley Parsing
  Statistical Approach: Probabilistic Parsing, Tree Bank Corpora
![Page 7: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/7.jpg)
Topics proposed to be covered (contd.)
Knowledge Representation and NLP
  Predicate Calculus, Semantic Net, Frames, Conceptual Dependency, Universal Networking Language (UNL)
Lexical Semantics
  Lexicons, Lexical Networks and Ontology
  Word Sense Disambiguation
Applications
  Machine Translation
  IR
  Summarization
  Question Answering
![Page 8: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/8.jpg)
Grading
Based on
  Midsem
  Endsem
  Assignments
  Paper-reading/Seminar
Except the first two, everything else is done in groups of 4.
Weightages will be revealed soon.
![Page 9: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/9.jpg)
Definitions etc.
![Page 10: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/10.jpg)
What is NLP
Branch of AI
2 Goals
  Science Goal: Understand the way language operates
  Engineering Goal: Build systems that analyse and generate language; reduce the man-machine gap
![Page 11: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/11.jpg)
The famous Turing Test: Language Based Interaction
Machine
Human
Test conductor
Can the test conductor find out which is the machine and which is the human?
![Page 12: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/12.jpg)
Inspired Eliza
http://www.manifestation.com/neurotoys/eliza.php3
![Page 13: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/13.jpg)
Inspired Eliza (another sample interaction)
A Sample of Interaction:
![Page 14: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/14.jpg)
“What is it” question: NLP is concerned with Grounding
Ground the language into perceptual, motor and cognitive capacities.
![Page 15: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/15.jpg)
Grounding
Chair
Computer
![Page 16: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/16.jpg)
Two Views of NLP and the Associated Challenges
1. Classical View
2. Statistical/Machine Learning View
![Page 17: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/17.jpg)
Stages of processing
  Phonetics and phonology
  Morphology
  Lexical Analysis
  Syntactic Analysis
  Semantic Analysis
  Pragmatics
  Discourse
![Page 18: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/18.jpg)
Phonetics
Processing of speech
Challenges
  Homophones: bank (finance) vs. bank (river bank)
  Near Homophones: maatraa vs. maatra (hin)
  Word Boundary
    aajaayenge: aa jaayenge (will come) or aaj aayenge (will come today)
    I got [ua]plate
  Phrase boundary
    mtech1 students are especially exhorted to attend as such seminars are integral to one's post-graduate education
  Disfluency: ah, um, ahem etc.
![Page 19: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/19.jpg)
Morphology
Word formation rules from root words
  Nouns: Plural (boy-boys); Gender marking (czar-czarina)
  Verbs: Tense (stretch-stretched); Aspect (e.g. perfective sit-had sat); Modality (e.g. request khaanaa khaaiie)
Crucial first step in NLP
Languages rich in morphology: e.g., Dravidian, Hungarian, Turkish
Languages poor in morphology: Chinese, English
Languages with rich morphology have the advantage of easier processing at the higher stages
A task of interest to computer science: Finite State Machines for Word Morphology
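To make the finite-state idea concrete, here is a minimal sketch of a suffix machine for regular English noun plurals such as boy-boys. The lexicon `STEMS`, the state names `q1` to `q3` and the two transition tables are assumptions made for illustration; a real morphological analyser would use a full finite-state transducer and a much larger lexicon.

```python
# Toy lexicon of noun stems (an assumption for illustration).
STEMS = {"boy", "dog", "box", "fox"}

# Suffix machines as transition tables: (state, character) -> next state.
# q1 is the state reached after the stem; q3 is the plural accepting state.
PLAIN = {("q1", "s"): "q3"}                        # boy + s
SIBILANT = {("q1", "e"): "q2", ("q2", "s"): "q3"}  # box + e + s


def run(delta, suffix):
    """Run a suffix machine; return the final state, or None if it gets stuck."""
    state = "q1"
    for ch in suffix:
        state = delta.get((state, ch))
        if state is None:
            return None
    return state


def analyse(word):
    """Return every (stem, number) analysis accepted by the toy machine."""
    analyses = []
    for i in range(1, len(word) + 1):
        stem, suffix = word[:i], word[i:]
        if stem not in STEMS:
            continue
        # Stems ending in a sibilant letter take "es", the others take "s".
        delta = SIBILANT if stem.endswith(("s", "x", "z")) else PLAIN
        final = run(delta, suffix)
        if final == "q1":
            analyses.append((stem, "SG"))
        elif final == "q3":
            analyses.append((stem, "PL"))
    return analyses
```

For example, `analyse("boys")` yields the stem `boy` marked plural, while an ill-formed string such as `boxs` gets no analysis.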
![Page 20: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/20.jpg)
Lexical Analysis
Essentially refers to dictionary access and obtaining the properties of the word
e.g. dog
  noun (lexical property)
  take-’s’-in-plural (morph property)
  animate (semantic property)
  4-legged (semantic property)
  carnivore (semantic property)
Challenge: Lexical or word sense disambiguation
![Page 21: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/21.jpg)
Lexical Disambiguation
First step: Part of Speech Disambiguation
  Dog as a noun (animal)
  Dog as a verb (to pursue)
Sense Disambiguation
  Dog (as animal)
  Dog (as a very detestable person)
Needs word relationships in a context
  The chair emphasised the need for adult education
Very common in day to day communications
  Satellite Channel Ad: Watch what you want, when you want (two senses of watch)
  e.g., Ground breaking ceremony/research
![Page 22: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/22.jpg)
Technological developments bring in new terms, additional meanings/nuances for existing terms
  Justify as in justify the right margin (word processing context)
  Xeroxed: a new verb
  Digital Trace: a new expression
  Communifaking: pretending to talk on mobile when you are actually not
  Discomgooglation: anxiety/discomfort at not being able to access internet
  Helicopter Parenting: over parenting
![Page 23: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/23.jpg)
Syntax Processing Stage
Structure Detection
(S (NP I) (VP (V like) (NP mangoes)))
![Page 24: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/24.jpg)
Parsing Strategy
Driven by grammar
  S -> NP VP
  NP -> N | PRON
  VP -> V NP | V PP
  N -> Mangoes
  PRON -> I
  V -> like
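As an illustration of grammar-driven parsing, the following is a minimal top-down, backtracking sketch over the toy grammar of this slide. The dictionary encoding, the function names and the lower-casing of words are assumptions made for the example; PP has no rule in the slide's grammar, so the `V PP` alternative simply never succeeds here.

```python
# The slide's toy grammar; terminals are lower-cased for matching (an assumption).
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["N"], ["PRON"]],
    "VP": [["V", "NP"], ["V", "PP"]],
    "N": [["mangoes"]],
    "PRON": [["i"]],
    "V": [["like"]],
}


def parse(symbol, tokens, start):
    """Yield (tree, end) for every way `symbol` can cover tokens[start:end]."""
    if symbol not in GRAMMAR:
        # A terminal word, or a non-terminal without rules (PP): match literally.
        if start < len(tokens) and tokens[start] == symbol:
            yield symbol, start + 1
        return
    for rhs in GRAMMAR[symbol]:
        yield from expand(symbol, rhs, tokens, start, [])


def expand(symbol, rhs, tokens, pos, children):
    """Match the remaining right-hand-side symbols left to right, backtracking."""
    if not rhs:
        yield (symbol, *children), pos
        return
    for child, nxt in parse(rhs[0], tokens, pos):
        yield from expand(symbol, rhs[1:], tokens, nxt, children + [child])


def parse_sentence(sentence):
    """Return all complete parse trees of the sentence, as nested tuples."""
    tokens = sentence.lower().split()
    return [tree for tree, end in parse("S", tokens, 0) if end == len(tokens)]
```

Calling `parse_sentence("I like mangoes")` returns a single tree. An ambiguous sentence under a richer grammar would return several trees, which is where the disambiguation problems discussed later come in.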
![Page 25: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/25.jpg)
Challenges in Syntactic Processing: Structural Ambiguity
Scope
  1. The old men and women were taken to safe locations
     (old men and women) vs. ((old men) and women)
  2. No smoking areas will allow Hookas inside
Preposition Phrase Attachment
  I saw the boy with a telescope (who has the telescope?)
  I saw the mountain with a telescope (world knowledge: mountain cannot be an instrument of seeing)
  I saw the boy with the pony-tail (world knowledge: pony-tail cannot be an instrument of seeing)
Ubiquitous: newspaper headline “20 years later, BMC pays father 20 lakhs for causing son’s death”
![Page 26: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/26.jpg)
Structural Ambiguity…
Overheard
  I did not know my PDA had a phone for 3 months
An actual sentence in the newspaper
  The camera man shot the man with the gun when he was near Tendulkar
(P.G. Wodehouse, Ring in Jeeves) Jill had rubbed ointment on Mike the Irish Terrier, taken a look at the goldfish belonging to the cook, which had caused anxiety in the kitchen by refusing its ant’s eggs…
(Times of India, 26/2/08) Aid for kins of cops killed in terrorist attacks
![Page 27: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/27.jpg)
Headache for Parsing: Garden Path sentences
Garden Pathing
  The horse raced past the garden fell.
  The old man the boat.
  Twin Bomb Strike in Baghdad kill 25 (Times of India 05/09/07)
![Page 28: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/28.jpg)
Semantic Analysis
Representation in terms of Predicate calculus/Semantic Nets/Frames/Conceptual Dependencies and Scripts
  John gave a book to Mary
  Give action: Agent: John, Object: Book, Recipient: Mary
Challenge: ambiguity in semantic role labeling
  (Eng) Visiting aunts can be a nuisance
  (Hin) aapko mujhe mithaai khilaanii padegii (ambiguous in Marathi and Bengali too; not in Dravidian languages)
![Page 29: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/29.jpg)
Pragmatics
Very hard problem
Model user intention
  Tourist (in a hurry, checking out of the hotel, motioning to the service boy): Boy, go upstairs and see if my sandals are under the divan. Do not be late. I just have 15 minutes to catch the train.
  Boy (running upstairs and coming back panting): yes sir, they are there.
World knowledge
  WHY INDIA NEEDS A SECOND OCTOBER (ToI, 2/10/07)
![Page 30: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/30.jpg)
Discourse
Processing of sequence of sentences
Mother to John:
  John go to school. It is open today. Should you bunk? Father will be very angry.
Ambiguity of open
bunk what?
Why will the father be angry?
  Complex chain of reasoning and application of world knowledge
Ambiguity of father
  father as parent or father as headmaster
![Page 31: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/31.jpg)
Complexity of Connected Text
John was returning from school dejected – today was the math test
He couldn’t control the class
Teacher shouldn’t have made him responsible
After all he is just a janitor
![Page 32: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/32.jpg)
A look at Textual Humour
1. Teacher (angrily): did you miss the class yesterday?
   Student: not much
2. A man coming back to his parked car sees the sticker "Parking fine". He goes and thanks the policeman for appreciating his parking skill.
3. Son: mother, I broke the neighbour's lamp shade.
   Mother: then we have to give them a new one.
   Son: no need, aunty said the lamp shade is irreplaceable.
4. Ram: I got a Jaguar car for my unemployed youngest son.
   Shyam: That's a great exchange!
5. Shane Warne should bowl maiden overs, instead of bowling maidens over
![Page 33: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/33.jpg)
Giving a flavour of what is done in NLP: Structure Disambiguation
Scope, Clause and Preposition/Postposition
![Page 34: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/34.jpg)
Structure Disambiguation is as critical as Sense Disambiguation
Scope (portion of text in the scope of a modifier)
  Old men and women will be taken to safe locations
  No smoking areas allow hookas inside
Clause
  I told the child that I liked that he came to the game on time
Preposition
  I saw the boy with a telescope
![Page 35: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/35.jpg)
Structure Disambiguation is as critical as Sense Disambiguation (contd.)
Semantic role
  Visiting aunts can be a nuisance
  Mujhe aapko mithaai khilaani padegii (“I have to give you sweets” or “You have to give me sweets”)
Postposition
  unhone teji se bhaagte hue chor ko pakad liyaa (“he caught the thief that was running fast” or “he ran fast and caught the thief”)
All these ambiguities lead to the construction of multiple parse trees for each sentence and need semantic, pragmatic and discourse cues for disambiguation
![Page 36: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/36.jpg)
Higher level knowledge needed for disambiguation
Semantics
  I saw the boy with a pony tail (pony tail cannot be an instrument of seeing)
Pragmatics
  ((old men) and women) as opposed to (old men and women) in “Old men and women were taken to safe locations”, since women, both young and old, were very likely taken to safe locations
Discourse
  No smoking areas allow hookas inside, except the one in Hotel Grand.
  No smoking areas allow hookas inside, but not cigars.
![Page 37: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/37.jpg)
Preposition Attachment Disambiguation
![Page 38: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/38.jpg)
Problem definition
4-tuples of the form V N1 P N2
  saw (V) boys (N1) with (P) telescopes (N2)
Attachment choice is between the matrix verb V and the object noun N1
![Page 39: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/39.jpg)
Lexical Association Table (Hindle and Rooth, 1991 and 1993)
From a large corpus of parsed text
  first find all noun phrase heads
  then record the verb (if any) that precedes the head and the preposition (if any) that follows it, as well as some other syntactic information about the sentence
Extract attachment information from this table of co-occurrences
![Page 40: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/40.jpg)
Example: lexical association
A table entry is considered a definite instance of the prepositional phrase attaching to the verb if:
  the verb definitely licenses the prepositional phrase
E.g. from Propbank, absolve frames
  absolve.XX: NP-ARG0 NP-ARG2-of obj-ARG1 1 absolve.XX NP-ARG0 NP-ARG2-of obj-ARG1
  On Friday , the firms filed a suit *ICH*-1 against West Virginia in New York state court asking for [ARG0 a declaratory judgment] [rel absolving] [ARG1 them] of [ARG2-of liability] .
![Page 41: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/41.jpg)
Core steps
Seven different procedures decide whether a table entry is an instance of no attachment, sure noun attach, sure verb attach, or ambiguous attach
These make it possible to extract frequency information, counting the number of times a particular verb or noun attaches with a particular preposition
![Page 42: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/42.jpg)
Core steps (contd.)
These frequencies serve as the training data for the statistical model used to predict correct attachment
To disambiguate a sentence, compute the likelihood of the particular preposition given the particular verb and contrast it with the likelihood of the preposition given the particular object noun, i.e., compare P(with|saw) with P(with|boy), as in I saw the boy with a telescope
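The comparison can be sketched as follows. This is a simplified illustration, not the full Hindle and Rooth procedure: the counts are hypothetical, the noun is taken to be the object noun N1 (the candidate attachment site), and the add-constant smoothing with `k` and `n_preps` is an assumption made so that unseen pairs do not give zero probabilities.

```python
import math
from collections import Counter

# Hypothetical counts of "sure" attachments harvested from a parsed corpus.
verb_prep = Counter({("saw", "with"): 8})
verb_total = Counter({"saw": 100})
noun_prep = Counter({("boy", "with"): 2})
noun_total = Counter({"boy": 80})


def cond_prob(pair_counts, head_counts, head, prep, k=0.5, n_preps=10):
    """Smoothed estimate of P(prep | head); k and n_preps are illustrative."""
    return (pair_counts[(head, prep)] + k) / (head_counts[head] + k * n_preps)


def attach(verb, noun, prep):
    """Compare P(prep | verb) with P(prep | noun) via a log ratio."""
    p_v = cond_prob(verb_prep, verb_total, verb, prep)
    p_n = cond_prob(noun_prep, noun_total, noun, prep)
    score = math.log2(p_v / p_n)
    return ("verb" if score > 0 else "noun"), score
```

With these toy counts `attach("saw", "boy", "with")` prefers verb attachment. The decision ignores N2 entirely, so "with a telescope" and "with a pony-tail" receive the same answer.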
![Page 43: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/43.jpg)
Critique
Limited by the number of relationships in the training corpora
Too large a parameter space
Model acquired during training is represented in a huge table of probabilities, precluding any straightforward analysis of its workings
![Page 44: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/44.jpg)
Approach based on Transformation Based Error Driven Learning, Brill and Resnick, COLING 1994
![Page 45: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/45.jpg)
Example Transformations
Initial attachments by default are to N1 predominantly.
![Page 46: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/46.jpg)
Transformation rules with word classes
WordNet synsets and semantic classes used
![Page 47: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/47.jpg)
Accuracy values of the transformation based approach: 12000 training and 500 test examples

| Method | Accuracy (%) | # of transformation rules |
|---|---|---|
| Hindle and Rooth (baseline) | 70.4 to 75.8 | NA |
| Transformations | 79.2 | 418 |
| Transformations (word classes) | 81.8 | 266 |
![Page 48: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/48.jpg)
Maximum Entropy Based Approach: (Ratnaparkhi, Reynar, Roukos, 1994)
Use more features than (V N1) bigram and (N1 P) bigram
Apply Maximum Entropy Principle
![Page 49: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/49.jpg)
Core formulation
We denote
  the partially parsed verb phrase, i.e., the verb phrase without the attachment decision, as a history h, and
  the conditional probability of an attachment as P(d|h),
where d ∈ {0, 1} corresponds to a noun or verb attachment, respectively.
![Page 50: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/50.jpg)
Maximize the training data log likelihood
--(1)
--(2)
![Page 51: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/51.jpg)
Equating the model expected parameters and training data parameters
--(3)
--(4)
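For reference, the textbook form of a conditional maximum entropy model is sketched below. The notation is an assumption (empirical distribution \tilde{p}, binary features f_i(h,d), weights \lambda_i) and may differ from the numbered equations on these slides: the model maximises the training data log likelihood under an exponential form, and at the optimum the model expectation of every feature equals its expectation in the training data.

```latex
% Training data log likelihood of the model p
L(p) = \sum_{(h,d)} \tilde{p}(h,d)\,\log p(d \mid h)

% Exponential (log-linear) form of the model
p(d \mid h) = \frac{\exp\big(\sum_i \lambda_i f_i(h,d)\big)}
                   {\sum_{d'} \exp\big(\sum_i \lambda_i f_i(h,d')\big)}

% Model expectation and training data expectation of feature f_i
E_p[f_i] = \sum_{(h,d)} \tilde{p}(h)\, p(d \mid h)\, f_i(h,d)
\qquad
E_{\tilde{p}}[f_i] = \sum_{(h,d)} \tilde{p}(h,d)\, f_i(h,d)

% Constraint satisfied by the maximum entropy solution
E_p[f_i] = E_{\tilde{p}}[f_i] \quad \text{for every feature } f_i
```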
![Page 52: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/52.jpg)
Features
Two types of binary-valued questions:
  Questions about the presence of any n-gram of the four head words, e.g., a bigram may be V == "is", P == "of"
  Features comprised solely of questions on words are denoted as “word” features
![Page 53: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/53.jpg)
Features (contd.)
Questions that involve the class membership of a head word
Binary hierarchy of classes derived by mutual information
![Page 54: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/54.jpg)
Features (contd.)
Given a binary class hierarchy, we can associate a bit string with every word in the vocabulary
Then, by querying the value of certain bit positions we can construct binary questions.
Features comprised solely of questions about class bits are denoted as “class” features, and features containing questions about both class bits and words are denoted as “mixed” features.
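The three kinds of questions can be sketched as small predicate builders. The bit strings in `CLASS_BITS` are invented for illustration (real ones would be read off the mutual-information class hierarchy), and the helper names are hypothetical. Only the question on the history h is shown; in the model each question is paired with an attachment decision d.

```python
# Hypothetical class bit strings, as if read off a binary class hierarchy.
CLASS_BITS = {"saw": "0110", "boy": "1011", "with": "0001", "telescope": "1010"}


def word_feature(slot, word):
    """Binary question: is the head word in `slot` equal to `word`?"""
    return lambda heads: int(heads[slot] == word)


def class_feature(slot, bit_index, bit_value):
    """Binary question: does the class bit string of the head in `slot`
    have `bit_value` at position `bit_index`?"""
    def question(heads):
        bits = CLASS_BITS.get(heads[slot], "")
        return int(bit_index < len(bits) and bits[bit_index] == bit_value)
    return question


def conjoin(*questions):
    """Conjunction of questions; fires only if all of them fire."""
    return lambda heads: int(all(q(heads) for q in questions))


heads = {"V": "saw", "N1": "boy", "P": "with", "N2": "telescope"}
word_q = conjoin(word_feature("V", "saw"), word_feature("P", "with"))      # "word" feature
class_q = class_feature("N2", 0, "1")                                      # "class" feature
mixed_q = conjoin(word_feature("P", "with"), class_feature("N1", 1, "0"))  # "mixed" feature
```

Querying a short prefix of the bit string asks about a coarse class, while querying more bits narrows the question towards the individual word.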
![Page 55: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/55.jpg)
Word classes (Brown et al., 1992)
![Page 56: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/56.jpg)
Experimental data size
![Page 57: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/57.jpg)
Performance of ME Model on Test Events
![Page 58: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/58.jpg)
Examples of Features Chosen for Wall St. Journal Data
![Page 59: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/59.jpg)
Average Performance of Human & ME Model on 300 Events of WSJ Data
![Page 60: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/60.jpg)
Human and ME model performance on consensus set for WSJ
![Page 61: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/61.jpg)
Average Performance of Human & ME Model on 200 Events of Computer Manuals Data
![Page 62: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/62.jpg)
Back-off model based approach (Collins and Brooks, 1995)
NP-attach: (joined ((the board) (as a non executive director)))
VP-attach: ((joined (the board)) (as a non executive director))
Correspondingly,
  NP-attach: 1 joined board as director
  VP-attach: 0 joined board as director
Quintuple of (attachment: A: 0/1, V, N1, P, N2)
5 random variables
![Page 63: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/63.jpg)
Probabilistic formulation
Or briefly,
If
Then the attachment is to the noun, else to the verb
![Page 64: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/64.jpg)
Maximum Likelihood estimate
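Using the quintuple (A, V, N1, P, N2) defined earlier, with A = 1 for noun attachment, the decision rule and its maximum likelihood estimate can be written as below. This is a hedged reconstruction in the usual notation for this model, where f(.) is assumed to denote the number of times a tuple occurs in the training data; the symbols on the slides may differ.

```latex
% Probability of noun attachment given the four head words, and its brief form
\hat{p}(1 \mid v, n_1, p, n_2) = P(A = 1 \mid V = v, N_1 = n_1, P = p, N_2 = n_2)

% Decision rule
\text{attach to the noun if } \hat{p}(1 \mid v, n_1, p, n_2) \ge 0.5,
\text{ else to the verb}

% Maximum likelihood estimate from training counts
\hat{p}(1 \mid v, n_1, p, n_2) = \frac{f(1, v, n_1, p, n_2)}{f(v, n_1, p, n_2)}
```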
![Page 65: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/65.jpg)
The Back-off estimate
Inspired by speech recognition
Prediction of the Nth word from previous (N-1) words
Data sparsity problem: f(w1, w2, w3, …, wn) will frequently be 0 for large values of n
![Page 66: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/66.jpg)
Back-off estimate contd.
The cut off frequencies (c1, c2 ....) are thresholds determining whether to back off or not at each level: counts lower than ci at stage i are deemed to be too low to give an accurate estimate, so in this case backing off continues.
![Page 67: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/67.jpg)
Back off for PP attachment
Note: the back off tuples always retain the preposition
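A minimal sketch of such a backed-off estimate is given below, in the spirit of Collins and Brooks (1995) but not a faithful reimplementation. It pools counts over all sub-tuples of a given size, every one of which keeps the preposition, and falls back to a noun-attachment default when nothing has been seen. The data layout, the function names, the `cutoff` parameter and the default are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def sub_tuples(v, n1, p, n2):
    """Sub-tuples grouped by size (4, 3, 2, 1 words); each keeps the preposition."""
    others = [("V", v), ("N1", n1), ("N2", n2)]
    return [
        [frozenset(combo) | {("P", p)} for combo in combinations(others, k)]
        for k in (3, 2, 1, 0)
    ]


def train(examples):
    """examples: iterable of (a, v, n1, p, n2), with a = 1 for noun attachment."""
    total, noun = Counter(), Counter()
    for a, v, n1, p, n2 in examples:
        for level in sub_tuples(v, n1, p, n2):
            for key in level:
                total[key] += 1
                noun[key] += a
    return total, noun


def p_noun_attach(total, noun, v, n1, p, n2, cutoff=0):
    """Back off from the full 4-tuple to triples, pairs, then the preposition."""
    for level in sub_tuples(v, n1, p, n2):
        denom = sum(total[key] for key in level)
        if denom > cutoff:
            return sum(noun[key] for key in level) / denom
    return 1.0  # nothing seen at any level: default to noun attachment


def attach(total, noun, v, n1, p, n2):
    return "noun" if p_noun_attach(total, noun, v, n1, p, n2) >= 0.5 else "verb"
```

Keying the counts on (slot, word) pairs keeps a word seen as V distinct from the same word seen as N1 or N2. Raising `cutoff` makes the estimator back off earlier, which is the role of the cut off frequencies described above.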
![Page 68: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/68.jpg)
The backoff algorithm
![Page 69: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/69.jpg)
Lower and upper bounds on performance
Lower bound (most frequent)
Upper bound (human experts looking at 4 words only)
![Page 70: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/70.jpg)
Results
![Page 71: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/71.jpg)
Comparison with other systems
Maxent, Ratnaparkhi et al.
Transformation Learning, Brill et al.
![Page 72: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/72.jpg)
Flexible Unsupervised PP Attachment using WSD and Data Sparsity Reduction: (Medimi Srinivas and Pushpak Bhattacharyya, IJCAI 2007)
Unsupervised approach (in some ways similar to Ratnaparkhi 1998): The training data is extracted from raw text
The unambiguous training data of the form V-P-N and N1-P-N2 TEACH the system how to resolve PP-attachment in ambiguous test data V-N1-P-N2
Refinement of the extracted training data, and use of N2 in the PP-attachment resolution process.
![Page 73: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/73.jpg)
Flexible Unsupervised PP Attachment using WSD and Data Sparsity Reduction: (Medimi Srinivas and Pushpak Bhattacharyya, IJCAI 2007)
PP-attachment is determined by the semantic property of lexical items in the context of preposition using WordNet
An Iterative Graph based unsupervised approach is used for Word Sense disambiguation (Similar to Mihalcea 2005)
Use of a Data sparseness Reduction (DSR) Process which uses lemmatization, Synset replacement and a form of inferencing. DSRP uses WordNet.
Flexible use of WSD and DSR processes for PP-Attachment
![Page 74: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/74.jpg)
Graph-based disambiguation: PageRank-based algorithm (Mihalcea 2005)
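A rough sketch of how a PageRank-style score can pick a sense is given below. It assumes an undirected graph whose nodes are (word, sense) pairs and whose edge weights are some positive sense-similarity measure; the weights, sense labels, damping factor and function names here are all invented for illustration and are not taken from the slide or the cited work.

```python
def rank_senses(edges, damping=0.85, iterations=30):
    """Weighted PageRank over an undirected sense graph.

    edges maps ((word, sense), (word, sense)) pairs to positive weights.
    """
    neighbors = {}
    for (a, b), w in edges.items():
        neighbors.setdefault(a, {})[b] = w
        neighbors.setdefault(b, {})[a] = w

    scores = {node: 1.0 for node in neighbors}
    for _ in range(iterations):
        new_scores = {}
        for node in neighbors:
            # Each neighbour passes on its score in proportion to edge weight.
            incoming = sum(
                w / sum(neighbors[other].values()) * scores[other]
                for other, w in neighbors[node].items()
            )
            new_scores[node] = (1 - damping) + damping * incoming
        scores = new_scores
    return scores

def pick_senses(scores):
    """For every word keep the sense with the highest score."""
    best = {}
    for (word, sense), score in scores.items():
        if word not in best or score > best[word][1]:
            best[word] = (sense, score)
    return {word: sense for word, (sense, _) in best.items()}

# Hypothetical similarity weights between senses of neighbouring words.
edges = {
    (("bank", "shore"), ("water", "liquid")): 0.9,
    (("bank", "institution"), ("water", "liquid")): 0.1,
    (("bank", "shore"), ("fish", "animal")): 0.8,
    (("bank", "institution"), ("fish", "animal")): 0.1,
}

print(pick_senses(rank_senses(edges)))
```

The sense of "bank" that is more strongly connected to the senses of its context words ends up with the higher score and is selected.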
![Page 75: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/75.jpg)
Experimental setup
Training Data: Brown corpus (raw text). Corpus size is 6 MB; it consists of 51763 sentences, about 1,027,000 words.
Most frequent prepositions in the syntactic context N1-P-N2: of, in, for, to, with, on, at, from, by
Most frequent prepositions in the syntactic context V-P-N: in, to, by, with, on, for, from, at, of
The extracted unambiguous tuples: N1-P-N2: 54030 and V-P-N: 22362
Test Data: Penn Treebank Wall Street Journal (WSJ) data extracted by Ratnaparkhi
It consists of V-N1-P-N2 tuples: 20801 (training), 4039 (development) and 3097 (test)
![Page 76: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/76.jpg)
Experimental setup contd.
Baseline: The unsupervised approach by Ratnaparkhi, 1998 (Base-RP).
Preprocessing:
Upper case to lower case
Any four-digit number less than 2100 is treated as a year
Any other number or % sign is converted to num
Experiments are performed using DSRP: with different stages of DSRP
Experiments are performed using GuWSD and DSRP: with different senses
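The three preprocessing rules can be sketched as follows. This is an illustration only: the placeholder strings `"year"` and `"num"`, the accepted number formats, and the function names are assumptions, since the slide does not give the exact tokens or patterns used.

```python
import re

def normalize_token(token: str) -> str:
    """Apply the three preprocessing rules to a single token."""
    # Rule 1: upper case to lower case.
    token = token.lower()
    # Rule 2: a four-digit number below 2100 is treated as a year.
    if re.fullmatch(r"\d{4}", token) and int(token) < 2100:
        return "year"
    # Rule 3: any other number, or a % sign, becomes "num".
    if token == "%" or re.fullmatch(r"\d+(?:[.,]\d+)*%?", token):
        return "num"
    return token

def preprocess(words):
    """Normalize every word of a tuple or sentence."""
    return [normalize_token(w) for w in words]

print(preprocess(["Rose", "5", "%", "in", "1987"]))
# ['rose', 'num', 'num', 'in', 'year']
```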
![Page 77: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/77.jpg)
The process of extracting training data: Data Sparsity Reduction

| Tools/process | Output |
| --- | --- |
| Raw Text | The professional conduct of the doctors is guided by Indian Medical Association. |
| POS Tagger | The_DT professional_JJ conduct_NN of_IN the_DT doctors_NNS is_VBZ guided_VBN by_IN Indian_NNP Medical_NNP Association_NNP ._. |
| Chunker | [The_DT professional_JJ conduct_NN] of_IN [the_DT doctors_NNS] (is_VBZ guided_VBN) by_IN [Indian_NNP Medical_NNP Association_NNP]. After replacing each chunk by its head word, the result is: conduct_NN of_IN doctors_NNS guided_VBN by_IN Association_NNP |
| Extraction Heuristics | N1PN2: conduct of doctors; VPN: guided by Association |
| Morphing | N1PN2: conduct of doctor; VPN: guide by association |
| DSRP (Synset Replacement) | N1PN2: {conduct, behavior} of {doctor, physician} can result in 4 combinations with the same sense; similarly, VPN: {guide, direct} by {association} can result in 2 combinations with the same sense. |
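The combination counts in the synset replacement step follow from a Cartesian product over the two synsets. The sketch below hard-codes the synsets from the example; in the actual process they would come from WordNet after sense disambiguation, and the helper name `expand_tuple` is hypothetical.

```python
from itertools import product

def expand_tuple(left_synset, preposition, right_synset):
    """Generate every tuple obtainable by swapping in synset members."""
    return [(left, preposition, right)
            for left, right in product(left_synset, right_synset)]

n1pn2 = expand_tuple(["conduct", "behavior"], "of", ["doctor", "physician"])
vpn = expand_tuple(["guide", "direct"], "by", ["association"])

for t in n1pn2 + vpn:
    print(" ".join(t))
print(len(n1pn2), len(vpn))  # 4 2
```

One extracted tuple therefore contributes several training tuples with the same sense, which is how the step reduces sparsity.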
![Page 78: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/78.jpg)
Data Sparsity Reduction: Inferencing
If V1-P-N1 and V2-P-N1 exist, as do V1-P-N2 and V2-P-N2, then if V3-P-Ni exists (i = 1 or 2), we can infer the existence of V3-P-Nj (j ≠ i), with the frequency count of V3-P-Ni, and add it to the corpus.
![Page 79: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/79.jpg)
Example of DSR by inferencing
V1-P-N1: play in garden and V2-P-N1: sit in garden
V1-P-N2: play in house and V2-P-N2: sit in house
V3-P-N2: jump in house exists
Infer the existence of V3-P-N1: jump in garden
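The inferencing rule and the example above can be sketched in a few lines. This is one possible reading of the rule, not the authors' implementation: it makes a single pass, treats two nouns as interchangeable when they share at least two verbs under the same preposition, and uses invented frequency counts. The function name `infer_tuples` and the `min_shared_verbs` parameter are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def infer_tuples(counts, min_shared_verbs=2):
    """Infer missing V-P-N tuples from a dict {(verb, prep, noun): count}."""
    verbs_by_noun = defaultdict(set)
    for verb, prep, noun in counts:
        verbs_by_noun[(prep, noun)].add(verb)

    inferred = {}
    for (p1, n1), (p2, n2) in combinations(sorted(verbs_by_noun), 2):
        if p1 != p2:
            continue
        # V1 and V2 of the rule: verbs seen with both nouns.
        shared = verbs_by_noun[(p1, n1)] & verbs_by_noun[(p1, n2)]
        if len(shared) < min_shared_verbs:
            continue
        # V3 of the rule: a verb seen with one noun but not the other.
        for src, dst in ((n1, n2), (n2, n1)):
            for verb in verbs_by_noun[(p1, src)] - verbs_by_noun[(p1, dst)]:
                # The inferred tuple inherits the count of its source tuple;
                # if several sources exist, the last one processed wins.
                inferred[(verb, p1, dst)] = counts[(verb, p1, src)]
    return inferred

# Tuples from the example; the counts are made up.
counts = {
    ("play", "in", "garden"): 3,
    ("sit", "in", "garden"): 2,
    ("play", "in", "house"): 4,
    ("sit", "in", "house"): 1,
    ("jump", "in", "house"): 5,
}

print(infer_tuples(counts))  # {('jump', 'in', 'garden'): 5}
```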
![Page 80: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/80.jpg)
Results
![Page 81: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/81.jpg)
Effect of various processes on FlexPPAttach algorithm
![Page 82: CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 1 – Introduction)](https://reader034.fdocuments.in/reader034/viewer/2022051118/568165a2550346895dd88160/html5/thumbnails/82.jpg)
Precision vs. various processes