TTIC 31210 · 2017. 5. 22. · local maximum of the marginal likelihood of the observed data 7. EM...
TTIC 31210: Advanced Natural Language Processing
Kevin Gimpel, Spring 2017
Lecture 14: Finish up Bayesian/Unsupervised NLP, Start Structured Prediction

• Today and Wednesday: structured prediction
• No class Monday May 29 (Memorial Day)
• Final class is Wednesday May 31
• Assignment 3 has been posted, due Thursday June 1
• Final project report due Friday, June 9

Key Quantities
Our data is a set of samples:

Gibbs Sampling Template

LDA

Expectation Maximization (EM)
• EM is an algorithmic template that finds a local maximum of the marginal likelihood of the observed data

EM
• “E” step:
  – compute posteriors over latent variables:
• “M” step:
  – update parameters given posteriors:
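
The two steps can be sketched on a toy problem. The example below is only an illustration, not the model from the lecture: it assumes a mixture of two coins in which each trial is a (heads, flips) pair and the identity of the coin is the latent variable. The function name `em_two_coins`, the starting values, and the data are all made up for this sketch.

```python
import math

def em_two_coins(data, theta0=0.6, theta1=0.5, mix=0.5, iters=20):
    """EM for a hypothetical mixture of two coins.

    data: list of (heads, flips) trials; which coin produced each trial is latent.
    Returns both head probabilities, the mixing weight of coin 0, and the
    observed-data log-likelihood (up to a constant) recorded before each M step.
    """
    history = []
    for _ in range(iters):
        # E step: posterior probability that each trial came from coin 0.
        posteriors = []
        loglik = 0.0
        for heads, flips in data:
            p0 = mix * theta0 ** heads * (1 - theta0) ** (flips - heads)
            p1 = (1 - mix) * theta1 ** heads * (1 - theta1) ** (flips - heads)
            loglik += math.log(p0 + p1)
            posteriors.append(p0 / (p0 + p1))
        history.append(loglik)

        # M step: re-estimate parameters from posterior-weighted counts.
        heads0 = sum(q * h for q, (h, n) in zip(posteriors, data))
        flips0 = sum(q * n for q, (h, n) in zip(posteriors, data))
        heads1 = sum((1 - q) * h for q, (h, n) in zip(posteriors, data))
        flips1 = sum((1 - q) * n for q, (h, n) in zip(posteriors, data))
        theta0, theta1 = heads0 / flips0, heads1 / flips1
        mix = sum(posteriors) / len(data)
    return theta0, theta1, mix, history

trials = [(5, 10), (9, 10), (8, 10), (4, 10), (7, 10)]
theta0, theta1, mix, history = em_two_coins(trials)
```

The recorded log-likelihood never decreases from one iteration to the next, which is the sense in which EM climbs toward a local maximum of the marginal likelihood.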

Different Views of the Dirichlet Process (DP)
• last time we discussed the “stick-breaking” view of the DP
• today we’ll briefly discuss the “Chinese Restaurant Process” view
• with both views, we still have the same DP hyperparameters (base distribution & concentration parameter)

Base Distribution for DP
• our unbounded distribution over items will choose them from the base distribution
• base distribution usually has infinite support
• simple example base distribution for our morph lexicon:

Concentration Parameter
• in stick-breaking process, concentration parameter determines how much of the stick we break off each time
• high concentration == small parts of stick
[figure: full stick]
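
The effect of the concentration parameter can be checked with a small simulation. The sketch below assumes the usual construction in which each break removes a Beta(1, s) fraction of the remaining stick; the slide's own formula did not survive extraction, so that choice, the truncation to a fixed number of pieces, and the name `stick_breaking` are assumptions of this example.

```python
import random

def stick_breaking(s, num_pieces, rng):
    """Truncated stick-breaking: return the lengths of the first num_pieces pieces.

    s is the concentration parameter; each break takes a Beta(1, s) fraction
    of whatever is left of the unit-length stick (assumed construction).
    """
    remaining = 1.0
    pieces = []
    for _ in range(num_pieces):
        frac = rng.betavariate(1.0, s)  # fraction of the remaining stick to break off
        pieces.append(remaining * frac)
        remaining *= 1.0 - frac
    return pieces

rng = random.Random(0)
low = stick_breaking(0.5, 10, rng)    # low concentration: a few large pieces
high = stick_breaking(20.0, 10, rng)  # high concentration: many small pieces
```

Under this construction the first piece has expected length 1 / (1 + s), so a larger s leaves more of the stick for later pieces.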

• the stick-breaking construction of the DP is useful for specifying models and defining inference algorithms
• another useful way of representing a draw from a DP is with the Chinese Restaurant Process (CRP)
  – CRP provides a distribution over partitions with an unbounded number of parts

• imagine a Chinese restaurant with an infinite number of tables…
[figure: a row of tables numbered 1, 2, 3, 4, …]
• first customer sits at first table:
• second customer enters, chooses a table:
  – chooses table 1:
  – chooses new table:
• second customer enters, chooses table 1
• third customer enters,
  – chooses table 1:
  – chooses new table:
• third customer enters, chooses new table
• fourth customer enters,
  – p(choose table 1):
  – p(choose table 2):
  – p(choose new table):
• large value of concentration parameter:
[figure: full stick]
• small value of concentration parameter:
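
The seating walk-through can be reproduced in a few lines. The probabilities on the slides were lost in extraction, so this sketch assumes the standard CRP rule: an occupied table is chosen in proportion to its number of customers, and a new table in proportion to the concentration parameter, written s as elsewhere in these slides. The function names are hypothetical.

```python
import random

def crp_probs(counts, s):
    """Seating probabilities for the next customer.

    counts[k] is the number of customers at occupied table k. The returned list
    has one entry per occupied table, followed by the new-table probability.
    """
    total = sum(counts) + s
    return [c / total for c in counts] + [s / total]

def sample_crp(n, s, rng):
    """Seat n customers one at a time and return the customers-per-table counts."""
    counts = []
    for _ in range(n):
        probs = crp_probs(counts, s)
        k = rng.choices(range(len(probs)), weights=probs)[0]
        if k == len(counts):
            counts.append(1)   # customer opens a new table
        else:
            counts[k] += 1     # customer joins occupied table k
    return counts

# Fourth customer from the walk-through: table 1 holds two customers, table 2 holds one.
fourth = crp_probs([2, 1], s=1.0)

rng = random.Random(0)
many_tables = sample_crp(200, 50.0, rng)  # large concentration parameter
few_tables = sample_crp(200, 0.1, rng)    # small concentration parameter
```

With s = 1 the fourth customer's options come out as 2/4, 1/4, and 1/4 under the assumed rule. A large s spreads customers over many tables, and a small s packs them into a few.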

A Draw G from a DP (Stick-Breaking Representation)
draw infinite probabilities from stick-breaking process with parameter s
draw atoms from base distribution
atoms can be repeated!

A Representation of G Drawn from a DP (Chinese Restaurant Process Representation)
draw table assignments for n customers with parameter s
for each occupied table, draw atom from base distribution
number of tables occupied
each draw from G is an atom, where its probability comes from the number of customers at its table
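
A minimal sketch of the CRP representation follows, under stated assumptions. The base distribution here (`toy_base_draw`, random lowercase strings with geometric length) is a made-up stand-in for the morph-lexicon base distribution, whose definition was lost. The seating rule is the standard CRP rule, and each atom's probability is taken to be its table's share of the n customers.

```python
import random
from collections import Counter

def toy_base_draw(rng):
    """Hypothetical base distribution: a random lowercase string of geometric length."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    morph = rng.choice(letters)
    while rng.random() < 0.5:
        morph += rng.choice(letters)
    return morph

def crp_representation(n, s, rng):
    """Seat n customers, draw one atom per occupied table, and weight atoms by table size."""
    counts = []                         # customers per occupied table
    for _ in range(n):
        weights = counts + [s]          # occupied tables by size, new table by s
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
    atoms = [toy_base_draw(rng) for _ in counts]   # one atom per occupied table
    probs = Counter()
    for atom, c in zip(atoms, counts):
        probs[atom] += c / n            # tables that drew the same atom pool their mass
    return counts, atoms, dict(probs)

rng = random.Random(0)
counts, atoms, probs = crp_representation(100, 2.0, rng)
```

Because the atoms are drawn independently from the base distribution, two tables can receive the same atom, in which case their probabilities add.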

When to be Bayesian?
• if you’re doing unsupervised learning or learning with latent variables
• if you want to marginalize out some model parameters
• if you want to learn the structure/architecture of your model
• if you want to learn a potentially-unbounded lexicon (Bayesian nonparametrics)

What is Structured Prediction?

Modeling, Inference, Learning
• Modeling: How do we assign a score to an (x, y) pair using parameters?
• Inference: How do we efficiently search over the space of all labels?
• Learning: How do we choose the parameters?
[annotations on an equation lost in extraction: "learning: choose _", "inference: solve _", "modeling: define score function"]
Structured Prediction: size of output space is exponential in size of input or is unbounded (e.g., machine translation) (we can’t just enumerate all possible outputs)
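
To make the three pieces concrete, here is a toy sketch in which inference is treated as an argmax over candidate outputs. That reading, the three-tag inventory, the feature names, and the weights are assumptions of this example, not the lecture's definitions. Brute-force enumeration is used on purpose to show how quickly the output space grows.

```python
from itertools import product

TAGS = ["det", "noun", "verb"]  # hypothetical tag inventory

def score(words, tags, params):
    """Modeling: sum emission and transition weights; missing entries count as 0."""
    total = 0.0
    prev = "<s>"
    for word, tag in zip(words, tags):
        total += params.get(("emit", tag, word), 0.0)
        total += params.get(("trans", prev, tag), 0.0)
        prev = tag
    return total

def brute_force_inference(words, params):
    """Inference by enumeration: score every tag sequence and keep the best one."""
    candidates = list(product(TAGS, repeat=len(words)))
    best = max(candidates, key=lambda tags: score(words, tags, params))
    return best, len(candidates)

# Learning would choose these weights from data; here they are set by hand.
params = {
    ("emit", "det", "the"): 2.0,
    ("emit", "noun", "man"): 2.0,
    ("emit", "verb", "walked"): 2.0,
    ("trans", "det", "noun"): 1.0,
    ("trans", "noun", "verb"): 1.0,
}
best, num_candidates = brute_force_inference(["the", "man", "walked"], params)
```

With three tags, a sentence of length n has 3**n candidate outputs, so enumeration only works for very short inputs.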

Part-of-Speech Tagging
Simplest kind of structured prediction: Sequence Labeling
Some/determiner questioned/verb(past) if/prep. Tim/proper Cook/proper ’s/poss. first/adj. product/noun
would/modal be/verb a/det. breakaway/adjective hit/noun for/prep. Apple/proper ./punc.
(the slide also shows a second tagging in which Tim, Cook, and Apple are tagged "noun" instead of "proper")

Named Entity Recognition
Formulating segmentation tasks as sequence labeling via B-I-O labeling:
Some/O questioned/O if/O Tim/B-PERSON Cook/I-PERSON ’s/O first/O product/O
would/O be/O a/O breakaway/O hit/O for/O Apple/B-ORGANIZATION ./O
B = “begin”, I = “inside”, O = “outside”
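
The B-I-O encoding can be turned back into labeled spans with a short decoder. This is a minimal sketch, and the function name `bio_to_spans` and the choice to ignore a stray I- tag are assumptions of the example. The tags are the ones from the example sentence.

```python
def bio_to_spans(tags):
    """Return (start, end, label) spans, with end exclusive, from a list of B-I-O tags.

    An I- tag that does not continue an open span with the same label is ignored.
    """
    spans = []
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # the extra "O" closes a span at the end
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.append((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans

first = ["O", "O", "O", "B-PERSON", "I-PERSON", "O", "O", "O"]
second = ["O", "O", "O", "O", "O", "O", "B-ORGANIZATION", "O"]
first_spans = bio_to_spans(first)
second_spans = bio_to_spans(second)
```

For the first line of the example, tokens 3 and 4 ("Tim Cook") come back as a single PERSON span. In the second line, token 6 ("Apple") is a one-token ORGANIZATION span.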

Constituent Parsing
(S (NP the man) (VP walked (PP to (NP the park))))
the man walked to the park
[figure: parse tree with nodes S, NP, VP, PP, NP above the tag sequence DT NN VBD IN DT NN]
Key: S = sentence, NP = noun phrase, VP = verb phrase, PP = prepositional phrase, DT = determiner, NN = noun, VBD = verb (past tense), IN = preposition
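
The bracketed string is a complete description of the tree, and it can be read back into a nested structure. The parser below is a small sketch for well-formed input like the example above. The helper names and the (label, children) tuple representation are choices made for this illustration.

```python
def parse_brackets(text):
    """Parse '(S (NP the man) ...)' into nested (label, children) tuples; leaves are strings."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def parse_node():
        nonlocal pos
        assert tokens[pos] == "("
        label = tokens[pos + 1]
        pos += 2
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(parse_node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1  # consume the closing ")"
        return (label, children)

    return parse_node()

def leaves(tree):
    """Return the words of the sentence in order."""
    if isinstance(tree, str):
        return [tree]
    return [word for child in tree[1] for word in leaves(child)]

def constituent_spans(tree, start=0):
    """Return (label, start, end) for every constituent, end exclusive, children first."""
    label, children = tree
    out = []
    pos = start
    for child in children:
        if isinstance(child, str):
            pos += 1
        else:
            sub = constituent_spans(child, pos)
            out.extend(sub)
            pos = sub[-1][2]  # the child's own span is appended last
    out.append((label, start, pos))
    return out

tree = parse_brackets("(S (NP the man) (VP walked (PP to (NP the park))))")
words = leaves(tree)
spans = constituent_spans(tree)
```

The span view shows why parsing is structured prediction: the output is a set of labeled spans that must nest consistently, not an independent label per word.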

Dependency Parsing
source: $ konnten sie es übersetzen ?
reference: $ could you translate it ?
[figure label: “wall” symbol]

Coreference Resolution
As we head towards training camp, the Philadelphia Eagles have finally filled most of their needs on offense. One of the main goals for this off-season was to find weapons for the team’s franchise quarterback, Carson Wentz. The Eagles needed a wide receiver who could stretch the field and give Wentz the opportunity to throw the long ball. They signed receiver Torrey Smith to a 3-year deal.
While the signing of Smith was huge for the team, the biggest signing the Eagles made was former Chicago Bears receiver Alshon Jeffery. He had a solid 5-year stint in Chicago, but as the team started to fall apart, Jeffery was forced to explore other options.

Coreference Resolution
input: a document
output: a set of “mentions” (textual spans in document), and memberships of those mentions in clusters

Semantic Role Labeling
Applications
• Question & answer systems
Who did what to whom at where?
The police officer detained the suspect at the scene of the crime
• The police officer: ARG0 (Agent)
• detained: V (Predicate)
• the suspect: ARG2 (Theme)
• at the scene of the crime: AM-loc (Location)
(J&M/SLP3)
input: a sentence
output: one span in the sentence identified as a predicate, and a set of other spans identified as particular roles for that predicate

Supervised Word Alignment
given parallel sentences, predict word alignments:
Brown et al. (1990)

Machine Translation
• phrase-based model (Koehn et al., 2003):
[figure: phrase pairs shown for the example sentence]
konnten : could
konnten sie : could you
sie : you
es übersetzen : translate it
sie es übersetzen : you translate it
übersetzen : translate
es : it
? : ?
input: a sentence in the source language
output: a segmentation of the source sentence into segments, a translation of each segment, and an ordering of the translations
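
The three parts of the output (segmentation, per-segment translation, ordering) can be written down as a small data structure. The sketch below uses only the phrase pairs listed on the slide. The function name `apply_derivation` and the (start, end) encoding of segments are assumptions of this illustration, and no scoring model is included.

```python
PHRASE_TABLE = {
    ("konnten",): "could",
    ("konnten", "sie"): "could you",
    ("sie",): "you",
    ("es", "übersetzen"): "translate it",
    ("sie", "es", "übersetzen"): "you translate it",
    ("übersetzen",): "translate",
    ("es",): "it",
    ("?",): "?",
}

def apply_derivation(source, segments, order):
    """Build the output sentence from one derivation.

    segments: (start, end) token ranges, end exclusive, covering the source once.
    order: the order in which the translated segments appear in the output.
    """
    covered = sorted(i for start, end in segments for i in range(start, end))
    assert covered == list(range(len(source))), "segments must cover the source exactly once"
    translations = [PHRASE_TABLE[tuple(source[start:end])] for start, end in segments]
    return " ".join(translations[i] for i in order)

source = "konnten sie es übersetzen ?".split()

# Derivation 1: three segments, kept in source order.
out1 = apply_derivation(source, [(0, 2), (2, 4), (4, 5)], [0, 1, 2])

# Derivation 2: word-by-word segments, with "übersetzen" moved before "es".
out2 = apply_derivation(source, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)], [0, 1, 3, 2, 4])
```

Both derivations produce the same English string, which illustrates that many different structured outputs can correspond to one surface translation.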

Key Categories of Structured Prediction
• I think of structured prediction methods in two primary categories: score-based and search-based

Score-Based Structured Prediction
• focus on defining the score function of the structured input/output pair:
• in dependency parsing, this is called “graph-based parsing” because minimum spanning tree algorithms can be used to find the globally-optimal max-scoring tree

Search-Based Structured Prediction
• focus on the procedure for searching through the structured output space (usually involves simple greedy or beam search)
• design a classifier to score a small number of decisions at each position in the search
• this classifier can use information about the current state as well as the entire history of the search
• in dependency parsing, this is called “transition-based parsing” because it consists of greedily, sequentially deciding what parsing decision to make

Structured Prediction
• to make SP practical, we need to decompose the SP problem into parts
• this is true whether we are going to use search-based or score-based SP
  – score-based: score function decomposes additively into scores of parts
  – search-based: search factors into a sequence of decisions, each one adding a part to the final output structure
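
The two decompositions can be contrasted on a tiny sequence-labeling problem. In this sketch, which is an illustration and not the lecture's own formulation, the score is assumed to decompose into emission and transition parts. The score-based route uses that decomposition for exact dynamic programming (Viterbi), and the search-based route makes one greedy decision per position. The tags, words, and weights are invented for the example.

```python
def viterbi(words, tagset, emit, trans):
    """Score-based: exact argmax using the additive decomposition into parts."""
    # best[t] = (score, path) of the best tag sequence so far that ends in tag t
    best = {t: (trans.get(("<s>", t), 0.0) + emit.get((t, words[0]), 0.0), [t]) for t in tagset}
    for word in words[1:]:
        new_best = {}
        for t in tagset:
            prev = max(tagset, key=lambda p: best[p][0] + trans.get((p, t), 0.0))
            total = best[prev][0] + trans.get((prev, t), 0.0) + emit.get((t, word), 0.0)
            new_best[t] = (total, best[prev][1] + [t])
        best = new_best
    return max(best.values(), key=lambda item: item[0])

def greedy(words, tagset, emit, trans):
    """Search-based: commit to the locally best decision at each position."""
    prev, total, path = "<s>", 0.0, []
    for word in words:
        t = max(tagset, key=lambda c: trans.get((prev, c), 0.0) + emit.get((c, word), 0.0))
        total += trans.get((prev, t), 0.0) + emit.get((t, word), 0.0)
        path.append(t)
        prev = t
    return total, path

tagset = ["A", "B"]
emit = {("A", "x"): 1.0, ("B", "x"): 0.5, ("B", "y"): 1.0}
trans = {("B", "B"): 2.0}
exact = viterbi(["x", "y"], tagset, emit, trans)
approx = greedy(["x", "y"], tagset, emit, trans)
```

Here the greedy search picks tag A for the first word because it looks best locally, and so misses the higher-scoring sequence B B that the exact method finds. That is the usual trade-off: search-based methods may use richer history at the cost of search errors, while score-based methods get exact inference only when the score decomposes into small parts.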