And now for something completely different · tbergkir/11711fa17/11711 FLT F17.pdf · 2017-10-04 ·...

And now for something completely different

Algorithms for NLP (11-711) Fall 2017

Formal Language Theory in one lecture

Robert Frederking

Now for Something Completely Different

• We will look at grammars from a “mathematical” point of view
• But Discrete Math (logic)
  – No real numbers
  – Symbolic discrete structures, proofs
• This is the source of many common algorithms/models
• Interested in complexity/power of different formal models of computation
  – Related to asymptotic complexity theory

Two main classes of models

• Automata
  – Machines, like Finite-State Automata
• Grammars
  – Rule sets, like we have been using to parse
• We will look at each class of model, going from simpler to more complex/powerful
• We can formally prove complexity-class relations between these formal models

Simplest level: FSA / Regular sets

Finite-State Automata (FSAs)

• Simplest formal automata
• We’ve seen these with numbers on them as HMMs, etc.

(Figure from Wikipedia)

Formal definition of automata

• A finite set of states, Q
• A finite alphabet of input symbols, Σ
• An initial (start) state, Q0 ∈ Q
• A set of final states, F ⊆ Q (each final state Fi ∈ Q)
• A transition function, δ: Q × Σ → Q
• This rigorously defines the FSAs we usually just draw as circles and arrows
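
The five components above map directly onto a small data structure. The following is a minimal sketch, not part of the lecture: the two state names and the example language (strings over {a, b} with an even number of a's) are assumptions chosen only for illustration.

```python
from typing import Dict, Set, Tuple

# A DFA as the 5-tuple (Q, Sigma, Q0, F, delta).
# State names and the example language are illustrative assumptions.
Q: Set[str] = {"even", "odd"}
SIGMA: Set[str] = {"a", "b"}
Q0: str = "even"
F: Set[str] = {"even"}
DELTA: Dict[Tuple[str, str], str] = {
    ("even", "a"): "odd",
    ("even", "b"): "even",
    ("odd", "a"): "even",
    ("odd", "b"): "odd",
}


def accepts(word: str) -> bool:
    """Run the DFA on `word`; accept iff it ends in a final state."""
    state = Q0
    for symbol in word:
        if symbol not in SIGMA:
            return False  # symbol outside the alphabet
        state = DELTA[(state, symbol)]
    return state in F
```

Here `accepts("abab")` holds (two a's) while `accepts("ab")` does not. The machine's only memory is the current state, which is the point the later slides rely on.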

Regular Grammars

• Left-linear or right-linear grammars
• Left-linear template: A → Bx or A → x
• Right-linear template: A → xB or A → x
• Example: S → aA | bB | ε, A → aS, B → bbS

Formal Definition of a Grammar

• Vocabulary of terminal symbols, Σ (e.g., a)
• Set of nonterminal symbols, N (e.g., A)
• Special start symbol, S ∈ N
• Production rules, such as A → aB
• Restrictions on the rules determine what kind of grammar you have
• A formal grammar G defines a formal language, L(G), the set of strings it generates
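
To make "the set of strings it generates" concrete, here is a hedged sketch that enumerates L(G) up to a length bound for the regular-grammar example S → aA | bB | ε, A → aS, B → bbS. The function name `language_up_to` and the breadth-first strategy are choices made for this illustration, not something the slides prescribe.

```python
from collections import deque
from typing import Dict, List, Set

TERMINALS: Set[str] = {"a", "b"}          # Sigma
NONTERMINALS: Set[str] = {"S", "A", "B"}  # N
START: str = "S"
# Productions; the empty string "" stands for epsilon.
RULES: Dict[str, List[str]] = {
    "S": ["aA", "bB", ""],
    "A": ["aS"],
    "B": ["bbS"],
}


def language_up_to(max_len: int) -> List[str]:
    """Enumerate the strings of L(G) with length <= max_len, breadth-first."""
    found: Set[str] = set()
    seen: Set[str] = {START}
    queue = deque([START])
    while queue:
        form = queue.popleft()
        # Index of the leftmost nonterminal in the sentential form, if any.
        idx = next((i for i, ch in enumerate(form) if ch in NONTERMINALS), None)
        if idx is None:
            found.add(form)  # only terminals left: a string of L(G)
            continue
        for rhs in RULES[form[idx]]:
            new_form = form[:idx] + rhs + form[idx + 1:]
            # Terminals are never erased, so too many terminals is a dead end.
            n_terminals = sum(ch in TERMINALS for ch in new_form)
            if n_terminals <= max_len and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return sorted(found, key=lambda s: (len(s), s))
```

Calling `language_up_to(5)` lists the empty string, `aa`, `bbb`, `aaaa`, `aabbb` and `bbbaa`, i.e. concatenations of the blocks `aa` and `bbb`.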

Regular Expressions

• For regular grammars, there’s a simpler way to write expressions: regular expressions:
  – Terminal symbols
  – (r + s)
  – (r • s)
  – r*
  – ε
• For example: (aa + bbb)*
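
The same example can be tried with Python's standard `re` module. This is only a sketch of the notation mapping: the textbook union "+" is written "|" in Python, concatenation is juxtaposition, and "*" is unchanged. Note that practical regex engines also offer features (such as backreferences) that go beyond regular languages; none are used here.

```python
import re

# (aa+bbb)* in textbook notation; textbook "+" (union) becomes "|".
PATTERN = re.compile(r"(?:aa|bbb)*")


def matches(word: str) -> bool:
    """True iff the whole word is in the language of (aa+bbb)*."""
    return PATTERN.fullmatch(word) is not None
```

`fullmatch` is used so that the entire string must belong to the language, rather than just some substring.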

Amazing fact #1: FSAs are equivalent to RGs

• Proof: two constructive proofs:
  – 1: given an arbitrary FSA, construct the corresponding Regular Grammar (and prove that it will only produce the strings the FSA would)
  – 2: given an arbitrary Regular Grammar, construct the corresponding FSA (and prove that it will only produce the strings the grammar would)
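
Direction 2 can be sketched in a few lines for right-linear grammars: each nonterminal becomes a state, a rule A → xB becomes a path from A to B reading x, and A → x leads to one extra accepting state. This is an illustrative sketch only (no correctness proof); the names `grammar_to_nfa`, `nfa_accepts` and `FINAL` are hypothetical, and unit productions A → B are deliberately not handled.

```python
from typing import Dict, List, Set, Tuple

# Right-linear example grammar; "" stands for epsilon.
RULES: Dict[str, List[str]] = {"S": ["aA", "bB", ""], "A": ["aS"], "B": ["bbS"]}
FINAL = "<final>"  # hypothetical name for one extra accepting state


def grammar_to_nfa(rules: Dict[str, List[str]]):
    """Return (delta, accepting) for an NFA equivalent to the grammar."""
    delta: Dict[Tuple[str, str], Set[str]] = {}
    accepting: Set[str] = {FINAL}
    fresh = 0
    for lhs, alternatives in rules.items():
        for rhs in alternatives:
            # Split the right-hand side into terminals plus optional nonterminal.
            if rhs and rhs[-1] in rules:
                prefix, target = rhs[:-1], rhs[-1]
            else:
                prefix, target = rhs, FINAL
            if not prefix:
                if target != FINAL:
                    raise ValueError("unit productions are not handled in this sketch")
                accepting.add(lhs)  # A -> epsilon makes A accepting
                continue
            state = lhs
            for i, symbol in enumerate(prefix):
                if i == len(prefix) - 1:
                    nxt = target
                else:
                    fresh += 1
                    nxt = f"<{lhs}{fresh}>"  # intermediate state for multi-symbol x
                delta.setdefault((state, symbol), set()).add(nxt)
                state = nxt
    return delta, accepting


def nfa_accepts(delta, accepting, start: str, word: str) -> bool:
    """Simulate the NFA by tracking the set of currently possible states."""
    current = {start}
    for symbol in word:
        current = {n for s in current for n in delta.get((s, symbol), set())}
    return bool(current & accepting)
```

For the example grammar, the resulting automaton accepts `aa`, `bbb` and `aabbb` and rejects `a`, `bb` and `ab`.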

DFSAs, NDFSAs

• Deterministic or Non-deterministic
  – Is δ function ambiguous or not?
  – For FSAs, weakly equivalent
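
Weak equivalence (same set of accepted strings) is usually shown with the subset construction, where each deterministic state is a set of nondeterministic states. The slides do not name the construction, so treat the following as one standard way to do it; the example NFA (strings over {a, b} ending in "ab") and all function names are assumptions for illustration, and ε-transitions are not handled.

```python
from typing import Dict, FrozenSet, Set, Tuple

# Example NFA: strings over {a, b} that end in "ab" (illustrative assumption).
NFA_DELTA: Dict[Tuple[int, str], Set[int]] = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
}
NFA_START = 0
NFA_ACCEPTING = {2}
ALPHABET = {"a", "b"}


def determinize(nfa_delta, start, accepting, alphabet):
    """Subset construction: DFA states are frozensets of NFA states."""
    start_set: FrozenSet[int] = frozenset({start})
    dfa_delta = {}
    dfa_accepting = set()
    todo = [start_set]
    seen = {start_set}
    while todo:
        current = todo.pop()
        if current & accepting:
            dfa_accepting.add(current)
        for symbol in sorted(alphabet):
            nxt = frozenset(
                n for s in current for n in nfa_delta.get((s, symbol), set())
            )
            dfa_delta[(current, symbol)] = nxt  # empty frozenset acts as a dead state
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return dfa_delta, start_set, dfa_accepting


def dfa_accepts(dfa_delta, start_set, dfa_accepting, word: str) -> bool:
    """Run the determinized machine; `word` must use only alphabet symbols."""
    state = start_set
    for symbol in word:
        state = dfa_delta[(state, symbol)]
    return state in dfa_accepting
```

In the worst case the number of subsets is exponential in the number of NFA states, which is the price paid for removing nondeterminism.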

Intersecting, etc., FSAs

• We can investigate what happens after performing different operations on FSAs:
  – Union
  – Intersection
  – Concatenation
  – Negation
  – other operations: determinizing and minimizing FSAs
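
As one instance of these operations, intersection of two deterministic FSAs can be done with the product construction: run both machines in lockstep and accept when both accept. The sketch below assumes complete DFAs given as `(delta, start, finals)` triples; the two example machines and the names `intersect` and `run` are illustrative assumptions.

```python
from typing import Dict, Set, Tuple

ALPHABET = ("a", "b")

# DFA 1: even number of a's. DFA 2: word ends in b. Both are example machines.
EVEN_A = (
    {("e", "a"): "o", ("e", "b"): "e", ("o", "a"): "e", ("o", "b"): "o"},
    "e",
    {"e"},
)
ENDS_B = (
    {("n", "a"): "n", ("n", "b"): "y", ("y", "a"): "n", ("y", "b"): "y"},
    "n",
    {"y"},
)


def intersect(m1, m2, alphabet):
    """Product construction: states are pairs (p, q) of the two machines' states."""
    delta1, start1, finals1 = m1
    delta2, start2, finals2 = m2
    start = (start1, start2)
    delta: Dict[Tuple[tuple, str], tuple] = {}
    finals: Set[tuple] = set()
    todo = [start]
    seen = {start}
    while todo:
        p, q = todo.pop()
        if p in finals1 and q in finals2:  # use "or" here to get the union instead
            finals.add((p, q))
        for symbol in alphabet:
            nxt = (delta1[(p, symbol)], delta2[(q, symbol)])
            delta[((p, q), symbol)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return delta, start, finals


def run(machine, word: str) -> bool:
    """Run a complete DFA given as (delta, start, finals)."""
    delta, state, finals = machine
    for symbol in word:
        state = delta[(state, symbol)]
    return state in finals
```

Because the result is again a finite-state machine, this also shows that the regular languages are closed under intersection.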

Proving a language is not regular

• So, what kinds of languages are not regular?
• Informally, a FSA can only remember a finite number of specific things. So a language requiring an unbounded memory won’t be regular.
• How about a^n b^n? “equal count of a’s and b’s”

Pumping Lemma: argument:

• Consider a machine with N states
• Now consider an input of length N; since we started in Q0, we will now be in the (N+1)st state visited
• There must be a loop: we had to visit at least 1 state twice; let x be the string up to the loop, y the part in the loop, and z after the loop
• So it must be okay to also have M copies of y for any M (including 0 copies)

Pumping Lemma: formally:

• If L is an infinite regular language, then there are strings x, y, and z such that y ≠ ε and xy^n z ∈ L, for all n ≥ 0.
• xyz being in the language requires also:
• xz, xyyz, xyyyz, xyyyyz, …, xyyyyyyyyyyz, …

Pumping Lemma: figure:

[Figure: a path from q0 to a state q reading x, a loop at q reading y, and a path from q to qN reading z]

Example proof that a L is not regular

• What about a^n b^n?
  ab, aabb, aaabbb, aaaabbbb, aaaaabbbbb, …
• Where do you draw the xy^n z lines?

Example proof that a L is not regular

• What about a^n b^n? Where do you draw the lines?
• Three cases:
  – y is only a’s: then xy^n z will have too many a’s
  – y is only b’s: then xy^n z will have too many b’s
  – y is a mix: then there will be interspersed a’s and b’s
• So a^n b^n cannot be regular, since it cannot be pumped
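
The three cases can be checked mechanically for small strings. The sketch below tries every way of cutting a word into x, y, z (y non-empty) and keeps the cuts that survive pumping for 0 to 3 copies of y. It is a finite sanity check, not a proof, and the names `pumpable_splits`, `is_anbn` and `is_ab_star` are assumptions made for this illustration.

```python
from typing import Callable, List, Tuple


def is_anbn(word: str) -> bool:
    """Membership test for a^n b^n (n >= 0)."""
    n = len(word) // 2
    return word == "a" * n + "b" * n


def is_ab_star(word: str) -> bool:
    """Membership test for the regular language (ab)*, used for contrast."""
    return word == "ab" * (len(word) // 2)


def pumpable_splits(
    word: str, member: Callable[[str], bool], max_copies: int = 3
) -> List[Tuple[str, str, str]]:
    """All cuts (x, y, z), y non-empty, with x y^k z in the language for k = 0..max_copies."""
    good = []
    for i in range(len(word)):
        for j in range(i + 1, len(word) + 1):
            x, y, z = word[:i], word[i:j], word[j:]
            if all(member(x + y * k + z) for k in range(max_copies + 1)):
                good.append((x, y, z))
    return good
```

For `aaabbb` no cut survives, matching the three cases above, whereas for the regular language (ab)* the word `abab` can be cut as x = ε, y = `ab`, z = `ab` and pumped freely.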

Next level: PDA / CFG

Push-Down Automata (PDAs)

• Let’s add some unbounded memory, but in a limited fashion
• So, add a stack:
• Allows you to handle some non-regular languages, but not everything
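
A stack is exactly what a^n b^n needs: push for every a, pop for every b. The following is a hand-written sketch of such a push-down recognizer, with two informal control states; the state names and the function `pda_accepts` are assumptions for illustration rather than a formal PDA definition.

```python
def pda_accepts(word: str) -> bool:
    """Recognize a^n b^n (n >= 0) with a stack and two control states."""
    stack = []
    state = "reading_a"
    for symbol in word:
        if state == "reading_a" and symbol == "a":
            stack.append("A")        # remember one unmatched a
        elif symbol == "b" and stack:
            state = "reading_b"      # from now on only b's are allowed
            stack.pop()              # match this b against an earlier a
        else:
            return False             # a after b, or b with nothing to match
    return not stack                 # every a must have been matched
```

The stack can grow without bound, which is the unbounded memory a finite-state machine lacks, but it can only be read from the top, which is why a single stack does not help with a^n b^n c^n.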

Context-Free Grammars

• Rule template: A → γ where γ is any sequence of terminals/non-terminals
• Example: S → aSb | ε
• We use these a lot in NLP
  – Expressive enough, not too complex to parse.
    • We often add hacks to allow non-CF information flow.
  – It just really feels like the right level of analysis.
    • (More on this later.)
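
For the example grammar S → aSb | ε, both derivation and recognition fit in a few lines. This is a sketch specific to that one grammar (a general CFG needs a real parser such as CKY or Earley); the function names `derivation` and `derives` are hypothetical.

```python
from typing import List


def derivation(n: int) -> List[str]:
    """Sentential forms of the derivation of a^n b^n using S -> aSb | epsilon."""
    forms = ["S"]
    for _ in range(n):
        forms.append(forms[-1].replace("S", "aSb"))  # apply S -> aSb
    forms.append(forms[-1].replace("S", ""))          # apply S -> epsilon
    return forms


def derives(word: str) -> bool:
    """True iff S =>* word, by undoing the rules from the outside in."""
    if word == "":
        return True                       # S -> epsilon
    if word[0] == "a" and word[-1] == "b":
        return derives(word[1:-1])        # S -> a S b
    return False
```

For example, `derivation(2)` gives S, aSb, aaSbb, aabb. The recursion in `derives` mirrors the nesting that the rule S → aSb creates.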

Amazing Fact #2: PDAs and CFGs are equivalent

• Same kind of proof as for FSAs and RGs, but more complicated
• Are there non-CF languages? How about a^n b^n c^n?

Highest level: TMs / Unrestricted grammars

Turing Machines

• Just let the machine move and write on the tape:
• This simple change produces a general-purpose computer: Church-Turing Hypothesis
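
To see what moving and writing buys, here is a small simulator together with a machine for a^n b^n c^n, the language raised earlier as a candidate non-CF language. The machine is a standard textbook-style marking strategy written for this sketch, not taken from the slides; state names, the blank symbol `_` and the step limit are all assumptions.

```python
from typing import Dict, Tuple

BLANK = "_"
Rule = Tuple[str, str, str]  # (symbol to write, move "L" or "R", next state)

# Each pass marks one a as X, one b as Y and one c as Z, then rewinds.
RULES: Dict[Tuple[str, str], Rule] = {
    ("q0", "a"): ("X", "R", "q1"),
    ("q0", "Y"): ("Y", "R", "q4"),          # no a's left: verify the rest
    ("q0", BLANK): (BLANK, "R", "accept"),  # empty input
    ("q1", "a"): ("a", "R", "q1"),
    ("q1", "Y"): ("Y", "R", "q1"),
    ("q1", "b"): ("Y", "R", "q2"),
    ("q2", "b"): ("b", "R", "q2"),
    ("q2", "Z"): ("Z", "R", "q2"),
    ("q2", "c"): ("Z", "L", "q3"),
    ("q3", "X"): ("X", "R", "q0"),          # back at the last marked a
    ("q4", "Y"): ("Y", "R", "q4"),
    ("q4", "Z"): ("Z", "R", "q4"),
    ("q4", BLANK): (BLANK, "R", "accept"),  # only marks remained
}
for s in "abYZ":
    RULES[("q3", s)] = (s, "L", "q3")       # rewind over everything else


def run_tm(rules, word: str, start: str = "q0", max_steps: int = 10_000) -> bool:
    """Simulate a one-tape TM; a missing rule means the machine halts and rejects."""
    tape = dict(enumerate(word))
    head, state = 0, start
    for _ in range(max_steps):
        if state == "accept":
            return True
        key = (state, tape.get(head, BLANK))
        if key not in rules:
            return False
        write, move, state = rules[key]
        tape[head] = write
        head += 1 if move == "R" else -1
    raise RuntimeError("step limit reached")
```

The step limit is only a guard for the sketch: in general there is no way to decide in advance whether an arbitrary Turing machine will halt.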

TM made of LEGOs

Unrestricted Grammars

• α → β, where each can be any sequence (α not empty)
• Thus, there is context in the rules: aAb → aab, bAb → bbb
• No surprise at this point: equivalent to TMs

Even more amazing fact: Chomsky hierarchy

• Provable that each of these four classes is a proper subset of the next one (Type 3 ⊂ Type 2 ⊂ Type 1 ⊂ Type 0):
  – Type 0: TM
  – Type 1: CSG
  – Type 2: CFG
  – Type 3: RE

[Figure: nested regions labeled 0, 1, 2, 3]

Linear-Bounded Automata / Context-Sensitive Grammars

• TM that uses space linear in the input
• αAβ → αγβ (γ not empty)
• We mostly ignore these; they get no respect
• Correspond to each other
• Limited compared to full-blown TM
  – But some questions about them (e.g., emptiness) are already undecidable

Chomsky Hierarchy: proofs

• Form of hierarchy proofs:
  – For each class, you can prove there are languages not in the class, similar to Pumping Lemma proof
  – You can easily prove that the larger class really does contain all the ones in the smaller class

Intersecting, etc., Ls

• We can again investigate what happens with Ls in these various classes under different operations on Ls:
  – Union
  – Intersection
  – Concatenation
  – Negation
  – other operations

Chomsky hierarchy: table

Mildly Context-Sensitive Grammars

• We really like CFGs, but are they in fact expressive enough to capture all human grammar?
• Many approaches start with a “CF backbone”, and add registers, equations, etc., that are not CF.
• Several non-hack extensions (CCG, TAG, etc.) turn out to be weakly equivalent!
  – “Mildly context sensitive”
  – So CSGs get even less respect…
  – And so much for the Chomsky Hierarchy being such a big deal

Trying to prove human languages are not CF

• Certainly true of semantics. But NL syntax?
• Cross-serial dependencies seem like a good target:
  – Mary, Jane, and Jim like red, green, and blue, respectively.
  – But is this syntactic?
• Surprisingly hard to prove

Swiss German dialect!

dative-NP, accusative-NP, dative-taking-VP, accusative-taking-VP

• Jan sait das mer em Hans es huus halfed aastriiche
• Jan says that we Hans the house helped paint
• “Jan says that we helped Hans paint the house”
• Jan sait das mer d’chind em Hans es huus haend wele laa halfe aastriiche
• Jan says that we the children Hans the house have wanted to let help paint
• “Jan says that we have wanted to let the children help Hans paint the house”

(A little like “The cat the dog the mouse scared chased likes tuna fish”)

Is Swiss German Context-Free?

Shieber’s complex argument…

L1 = Jan sait das mer (d’chind)* (em Hans)* es huus haend wele (laa)* (halfe)* aastriiche

L2 = Swiss German

L1 ∩ L2 = Jan sait das mer (d’chind)^n (em Hans)^m es huus haend wele (laa)^n (halfe)^m aastriiche
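
The shape of the argument: L1 is regular, context-free languages are closed under intersection with regular languages, and the intersection has the cross-serial pattern x^n y^m z^n w^m, which is not context-free; so L2 cannot be context-free either. The sketch below encodes L1 as a Python regular expression and then adds the count-matching condition as a separate check. It models only the counting pattern from the slide, not Swiss German grammar, and the group and function names are assumptions.

```python
import re

# L1 from the slide as a regular expression; group names are hypothetical.
L1 = re.compile(
    r"Jan sait das mer"
    r"(?P<acc_np>(?: d’chind)*)"
    r"(?P<dat_np>(?: em Hans)*)"
    r" es huus haend wele"
    r"(?P<acc_v>(?: laa)*)"
    r"(?P<dat_v>(?: halfe)*)"
    r" aastriiche"
)


def in_l1(sentence: str) -> bool:
    """Membership in the regular language L1 (counts unconstrained)."""
    return L1.fullmatch(sentence) is not None


def has_cross_serial_counts(sentence: str) -> bool:
    """In L1, and the n-th kind of NP count equals the count of its matching verb."""
    m = L1.fullmatch(sentence)
    if m is None:
        return False
    n_np = m.group("acc_np").count("d’chind")
    m_np = m.group("dat_np").count("em Hans")
    n_v = m.group("acc_v").count("laa")
    m_v = m.group("dat_v").count("halfe")
    return n_np == n_v and m_np == m_v
```

The regular expression alone cannot enforce the matching counts; the extra comparison in `has_cross_serial_counts` is exactly the part that falls outside the regular (and, by the argument above, the context-free) languages.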

Why do we care? (1)

• Math is fun?
• Complexity:
  – If you can use a RE, don’t use a CFG.
  – Be careful with anything fancier than a CFG.
• Safety: harder to write correct systems on a Turing Machine.
• Being able to use a weaker formalism may have explanatory power?

Why do we care? (2)

• Probably a source for future new algorithms
• Probably not how humans actually process NL
• Might not matter as much for NLP now that we know about real numbers?
  – But we don’t want your friends making fun of you


More Examples

• The cat likes tuna fish
• The cat the dog chased likes tuna fish
• The cat the dog the mouse scared chased likes tuna fish
• The cat the dog the mouse the elephant squashed scared chased likes tuna fish
• The cat the dog the mouse the elephant the flea bit squashed scared chased likes tuna fish
• The cat the dog the mouse the elephant the flea the virus infected bit squashed scared chased likes tuna fish
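
These sentences follow a nested pattern: each added noun phrase needs its verb, and the verbs come out in reverse order, much like a^n b^n. A short sketch that regenerates the list (the function name `embed` is an assumption; the word lists are taken from the examples above):

```python
NOUNS = ["dog", "mouse", "elephant", "flea", "virus"]
VERBS = ["chased", "scared", "squashed", "bit", "infected"]


def embed(depth: int) -> str:
    """Center-embedded sentence with `depth` nested relative clauses (0..5)."""
    subjects = ["The cat"] + [f"the {noun}" for noun in NOUNS[:depth]]
    verbs = list(reversed(VERBS[:depth]))  # innermost clause's verb comes first
    return " ".join(subjects + verbs + ["likes tuna fish"])
```

For instance, `embed(2)` produces "The cat the dog the mouse scared chased likes tuna fish". The grammar allows any depth, even though human readers give up after two or three levels.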