CS 479, section 1: Natural Language Processing
Transcript of CS 479, section 1: Natural Language Processing
Thanks to Dan Klein of UC Berkeley for many of the materials used in this lecture.
CS 479, section 1: Natural Language Processing
Lecture #22: Part of Speech Tagging, Hidden Markov Models
This work is licensed under a Creative Commons Attribution-Share Alike 3.0 Unported License.
Announcements

- Reading Report #9: Hidden Markov Models
  - Due: now
- Project #2, Part 1 loose ends
  - Resolve issues raised by the TA and respond promptly
- Project #2, Part 2
  - Early: Friday; Due: Monday
- Mid-course Evaluation
  - Your feedback is important
- Colloquium by Dr. Jordan Boyd-Graber
  - Thursday at 11am
Objectives

- New general problem: labeling sequences
- First application: Part-of-Speech Tagging
- Introduce first technique: Hidden Markov Models (HMMs)
Parts-of-Speech

- Syntactic classes of words: where do they come from?
- Useful distinctions vary from language to language
- Tag-sets even vary from corpus to corpus [See M&S p. 142]

Some tags from the Penn tag-set:

| Tag | Description | Examples |
|-----|-------------|----------|
| CD | numeral, cardinal | mid-1890, nine-thirty, 0.5, one |
| DT | determiner | a, all, an, every, no, that, the |
| IN | preposition or conjunction, subordinating | among, whether, out, on, by, if |
| JJ | adjective or numeral, ordinal | third, ill-mannered, regrettable |
| MD | modal auxiliary | can, may, might, will, would |
| NN | noun, common, singular or mass | cabbage, thermostat, investment, subhumanity |
| NNP | noun, proper, singular | Motown, Cougar, Yvette, Liverpool |
| PRP | pronoun, personal | hers, himself, it, we, them |
| RB | adverb | occasionally, maddeningly, adventurously |
| RP | particle | aboard, away, back, by, on, open, through |
| VB | verb, base form | ask, bring, fire, see, take |
| VBD | verb, past tense | pleaded, swiped, registered, saw |
| VBN | verb, past participle | dilapidated, imitated, reunified, unsettled |
| VBP | verb, present tense, not 3rd person singular | twist, appear, comprise, mold, postpone |
| Tag | Description | Examples |
|-----|-------------|----------|
| CC | conjunction, coordinating | and, both, but, either, or |
| CD | numeral, cardinal | mid-1890, nine-thirty, 0.5, one |
| DT | determiner | a, all, an, every, no, that, the |
| EX | existential there | there |
| FW | foreign word | gemeinschaft, hund, ich, jeux |
| IN | preposition or conjunction, subordinating | among, whether, out, on, by, if |
| JJ | adjective or numeral, ordinal | third, ill-mannered, regrettable |
| JJR | adjective, comparative | braver, cheaper, taller |
| JJS | adjective, superlative | bravest, cheapest, tallest |
| MD | modal auxiliary | can, may, might, will, would |
| NN | noun, common, singular or mass | cabbage, thermostat, investment, subhumanity |
| NNP | noun, proper, singular | Motown, Cougar, Yvette, Liverpool |
| NNPS | noun, proper, plural | Americans, Materials, States |
| NNS | noun, common, plural | undergraduates, bric-a-brac, averages |
| POS | genitive marker | ', 's |
| PRP | pronoun, personal | hers, himself, it, we, them |
| PRP$ | pronoun, possessive | her, his, mine, my, our, ours, their, thy, your |
| RB | adverb | occasionally, maddeningly, adventurously |
| RBR | adverb, comparative | further, gloomier, heavier, less-perfectly |
| RBS | adverb, superlative | best, biggest, nearest, worst |
| RP | particle | aboard, away, back, by, on, open, through |
| TO | "to" as preposition or infinitive marker | to |
| UH | interjection | huh, howdy, uh, whammo, shucks, heck |
| VB | verb, base form | ask, bring, fire, see, take |
| VBD | verb, past tense | pleaded, swiped, registered, saw |
| VBG | verb, present participle or gerund | stirring, focusing, approaching, erasing |
| VBN | verb, past participle | dilapidated, imitated, reunified, unsettled |
| VBP | verb, present tense, not 3rd person singular | twist, appear, comprise, mold, postpone |
| VBZ | verb, present tense, 3rd person singular | bases, reconstructs, marks, uses |
| WDT | WH-determiner | that, what, whatever, which, whichever |
| WP | WH-pronoun | that, what, whatever, which, who, whom |
| WP$ | WH-pronoun, possessive | whose |
| WRB | WH-adverb | however, whenever, where, why |
Part-of-Speech Ambiguity

Favorite example:

| Fed | raises | interest | rates | 0.5 | percent |
|-----|--------|----------|-------|-----|---------|
| NNP | NNS    | NN       | NNS   | CD  | NN      |
| VBN | VBZ    | VBP      | VBZ   |     |         |
| VBD |        | VB       |       |     |         |
Why POS Tagging?

- Text-to-speech: record[v] vs. record[n], lead[v] vs. lead[n], object[v] vs. object[n]
- Lemmatization: saw[v] → see; saw[n] → saw
- Quick-and-dirty NP-chunk detection: grep {JJ | NN}* {NN | NNS}
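The grep-style pattern above can be sketched as a small scan over (word, tag) pairs. This is a hypothetical illustration, not code from the lecture; the greedy left-to-right, maximal-match semantics is an assumption about how the pattern is meant to apply:

```python
def np_chunks(tagged):
    """Greedy scan for the pattern {JJ|NN}* {NN|NNS} over a tagged sentence.

    `tagged` is a list of (word, tag) pairs; returns the matched word spans.
    """
    chunks, i, n = [], 0, len(tagged)
    while i < n:
        j = i
        while j < n and tagged[j][1] in ("JJ", "NN"):  # {JJ|NN}*
            j += 1
        if j < n and tagged[j][1] == "NNS":            # optional final NNS
            j += 1
        # A valid chunk is non-empty and ends in NN or NNS.
        if j > i and tagged[j - 1][1] in ("NN", "NNS"):
            chunks.append(" ".join(w for w, _ in tagged[i:j]))
            i = j
        else:
            i += 1
    return chunks

print(np_chunks([("the", "DT"), ("interbank", "NN"), ("offered", "JJ"),
                 ("rates", "NNS"), ("fell", "VBD")]))
# → ['interbank offered rates']
```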
Why POS Tagging?

Useful as a pre-processing step for parsing: less tag ambiguity means fewer parses. However, some tag choices are better decided by parsers! E.g., a parser is better placed to choose VBN for "offered" and IN for "on":

| The | average | of | interbank | offered   | rates | plummeted | … |
|-----|---------|----|-----------|-----------|-------|-----------|---|
| DT  | NN      | IN | NN        | VBD → VBN | NNS   | VBD       |   |

| The | Georgia | branch | had | taken | on      | loan | commitments | … |
|-----|---------|--------|-----|-------|---------|------|-------------|---|
| DT  | NNP     | NN     | VBD | VBN   | RP → IN | NN   | NNS         |   |
Part-of-Speech Ambiguity

Back to our example:

| Fed | raises | interest | rates | 0.5 | percent |
|-----|--------|----------|-------|-----|---------|
| NNP | NNS    | NN       | NNS   | CD  | NN      |
| VBN | VBZ    | VBP      | VBZ   |     |         |
| VBD |        | VB       |       |     |         |

What information sources would help? Two basic sources of constraint:

- Grammatical environment
- Identity of the current word

Many more possible features are available … but we won't be able to use them just yet.
How?

Recall our two basic sources of information:

- Grammatical environment
- Identity of the current word

How can we use these insights in a joint model? Condition each tag on the previous tag(s), and each word on its own tag. Remember, think generative!

What would that look like?
Hidden Markov Model (HMM)
“HMM”?
Hidden Markov Model (HMM)

- A generative model over tag sequences and observations
- Assume: the tag sequence is generated by an order-n Markov chain
- Assume: words are chosen independently, conditioned only on the tag
- E.g., order 2: P(t₁ … tₙ, w₁ … wₙ) = ∏ᵢ P(tᵢ | tᵢ₋₂, tᵢ₋₁) · P(wᵢ | tᵢ)
- Need two "local models":
  - Transitions: P(tᵢ | tᵢ₋₂, tᵢ₋₁)
  - Emissions: P(wᵢ | tᵢ)
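The order-2 factorization can be sketched directly as code. The transition and emission tables below are hypothetical stand-ins for trained local models, and `"<S>"`/`"STOP"` are assumed boundary symbols:

```python
START, STOP = "<S>", "STOP"

def joint_prob(words, tags, trans, emit):
    """Score P(tags, words) under an order-2 HMM.

    trans[(t_prev2, t_prev1, t)] and emit[(t, w)] are probabilities;
    unseen events default to 0.
    """
    p = 1.0
    context = (START, START)               # two start symbols pad the chain
    for w, t in zip(words, tags):
        p *= trans.get((*context, t), 0.0) * emit.get((t, w), 0.0)
        context = (context[1], t)          # slide the order-2 context window
    # The tag chain ends by transitioning to STOP.
    return p * trans.get((*context, STOP), 0.0)

# Toy one-word example with made-up probabilities:
trans = {("<S>", "<S>", "NNP"): 0.5, ("<S>", "NNP", "STOP"): 0.4}
emit = {("NNP", "Fed"): 0.1}
print(joint_prob(["Fed"], ["NNP"], trans, emit))  # → 0.5 * 0.1 * 0.4 = 0.02
```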
Parameter Estimation

- Transition model: use standard smoothing methods to estimate transition scores, e.g., interpolating trigram, bigram, and unigram tag estimates
- Emission model: trickier. What about …
  - words we've never seen before?
  - words which occur with tags we've never seen?
- One option: break out the Good-Turing smoothing
- But words aren't black boxes: 343,127.23, 11-year, Minteria, reintroducible
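A minimal estimation sketch from tagged sentences, using add-lambda smoothing as a simple stand-in for the interpolation and Good-Turing methods mentioned above (the function and its tables are illustrative, not the lecture's actual setup):

```python
from collections import Counter

def train_hmm(tagged_sents, lam=0.1):
    """Estimate smoothed order-2 transition and emission models.

    tagged_sents: list of sentences, each a list of (word, tag) pairs.
    Returns two functions, p_trans(t2, t1, t) and p_emit(t, w).
    """
    trans_counts, context_counts = Counter(), Counter()
    emit_counts, tag_counts = Counter(), Counter()
    tagset, vocab = set(), set()
    for sent in tagged_sents:
        tags = ["<S>", "<S>"] + [t for _, t in sent] + ["STOP"]
        for i in range(2, len(tags)):
            trans_counts[(tags[i - 2], tags[i - 1], tags[i])] += 1
            context_counts[(tags[i - 2], tags[i - 1])] += 1
        for w, t in sent:
            emit_counts[(t, w)] += 1
            tag_counts[t] += 1
            tagset.add(t)
            vocab.add(w)
    V, T = len(vocab), len(tagset) + 1  # +1 tag outcome for STOP

    def p_trans(t2, t1, t):
        # Add-lambda smoothing over the T possible next tags.
        return (trans_counts[(t2, t1, t)] + lam) / (context_counts[(t2, t1)] + lam * T)

    def p_emit(t, w):
        # +1 word outcome reserves mass for unknown words.
        return (emit_counts[(t, w)] + lam) / (tag_counts[t] + lam * (V + 1))

    return p_trans, p_emit
```

With this scheme every unseen word still gets a small emission probability from each tag, which is exactly the problem the slide flags; exploiting word shape (digits, hyphens, suffixes like -ible) would do better.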
Disambiguation

- Tagging is disambiguation: finding the best tag sequence
- Roughly, think of this as sequential classification, where the choice also depends on the uncertain decision made in the previous step
- Given an HMM (i.e., distributions for transitions and emissions), we can score any word sequence and tag sequence
- In principle, we're done! We have a tagger:
  - We could enumerate all possible tag sequences
  - Score them all
  - Pick the best one

| Fed | raises | interest | rates | 0.5 | percent | . |      |
|-----|--------|----------|-------|-----|---------|---|------|
| NNP | VBZ    | NN       | NNS   | CD  | NN      | . | STOP |

P(t, w) = P(NNP | START, START) · P(Fed | NNP) · P(VBZ | START, NNP) · P(raises | VBZ) · P(NN | NNP, VBZ) · P(interest | NN) · …
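The enumerate-score-pick recipe can be written in a few lines. This is a hypothetical sketch: `trans` and `emit` are made-up order-2 local models, and the point is that enumeration costs O(|tagset|^n), which is hopeless for real sentences:

```python
from itertools import product

def brute_force_tag(words, tagset, trans, emit):
    """Tag by enumerating every tag sequence and scoring it under the HMM."""
    def score(tags):
        p, ctx = 1.0, ("<S>", "<S>")
        for w, t in zip(words, tags):
            p *= trans.get((*ctx, t), 0.0) * emit.get((t, w), 0.0)
            ctx = (ctx[1], t)
        return p * trans.get((*ctx, "STOP"), 0.0)

    # |tagset| ** len(words) candidate sequences: exponential in sentence length.
    return max(product(tagset, repeat=len(words)), key=score)

# Toy two-word example; only one sequence has nonzero probability.
trans = {("<S>", "<S>", "NNP"): 0.6, ("<S>", "NNP", "VBZ"): 0.5,
         ("NNP", "VBZ", "STOP"): 0.9}
emit = {("NNP", "Fed"): 0.8, ("VBZ", "raises"): 0.7}
print(brute_force_tag(["Fed", "raises"], ["NNP", "VBZ"], trans, emit))
# → ('NNP', 'VBZ')
```

The exponential blow-up here is exactly what the Viterbi algorithm, covered next, avoids via dynamic programming.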
Next

Efficient use of Hidden Markov Models for POS tagging: the Viterbi algorithm