Texts as Knowledge Bases
Christopher Manning, joint work with Gabor Angeli and Danqi Chen
Stanford NLP Group · @chrmanning · @stanfordnlp
AKBC 2016
Machine Comprehension = Machine has an Augmented Knowledge Base

“A machine comprehends a passage of text if, for any question regarding that text that can be answered correctly by a majority of native speakers, that machine can provide a string which those speakers would agree both answers that question, and does not contain information irrelevant to that question.” – Chris Burges
How far do current deep learning reading comprehension systems go in achieving Chris Burges's goal?
Two case studies … previews of ACL 2016

How can we use natural logic and shallow reasoning to better treat texts as a knowledge base?

DeepMind RC dataset [Hermann et al. 2015]
DeepMind RC dataset

• Large dataset
• Real language
• Good for DL training!
• “Artificial” pre-processing (coref, anonymization)
• How hard is it?
• Is it a good task?
Results on DeepMind RC when we began [Hermann et al. 2015; Hill et al. 2016]

System                     CNN Dev   CNN Test   Daily Mail Dev   Daily Mail Test
Frame-semantic model       36.3      40.2       35.5             35.5
Word distance model        50.5      50.9       56.4             55.5
Deep LSTM Reader           55.0      57.0       63.3             62.2
Attentive Reader           61.6      63.0       70.5             69.0
Impatient Reader           61.8      63.8       69.0             68.0
MemNN window memory        58.0      60.6       –                –
MemNN window + self-sup    63.4      66.8       –                –
MemNN win, ss, ens, no-c   66.2      69.4       –                –
Frame semantics or simple syntax?

Frame-semantic parsing attempts to identify predicates and their semantic arguments – it should be good for question answering!

Hermann et al. use a “state-of-the-art frame-semantic parser” – the Google version of [Das et al. 2013; Hermann et al. 2014]

But frame-semantic systems have coverage problems: they fail to represent pertinent relations that are not mapped onto verbal frames

How about a good old feature-based system, using a syntactic dependency parser?
System I: Standard Entity-Centric Classifier [Chen, Bolton, & Manning, ACL 2016]

• Build a symbolic feature vector for each entity
• The goal is to learn feature weights such that the correct answer ranks higher than the other entities
• Train logistic regression and a MART classifier (boosted decision trees – these do better and are reported)
• Whether e is in the passage
• Whether e is in the question
• Frequency of e in the passage
• First position of e in the passage
• n-gram exact match (features for matching 1/2 words to the left/right)
• Word distance of question words in the passage
• Whether e co-occurs with the question verb or another entity
• Syntactic dependency parse triple match around e
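To make the recipe concrete, here is a rough sketch of how such a feature vector might be assembled (illustrative only, not the paper's implementation; the function and its inputs are hypothetical):

```python
# Sketch of the entity-centric feature vector (not the authors' code).
# `passage` and `question` are pre-tokenized word lists; `entity` is an
# anonymized marker such as "@entity3".
def entity_features(entity, passage, question):
    positions = [i for i, w in enumerate(passage) if w == entity]
    return {
        "in_passage": bool(positions),
        "in_question": entity in question,
        "frequency": len(positions),
        "first_position": positions[0] if positions else -1,
        # The remaining features (n-gram exact match around the question
        # placeholder, word distance, co-occurrence with the question verb,
        # dependency-triple match) additionally need the placeholder
        # position and a dependency parse, omitted in this sketch.
    }
```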
Competent (traditional) statistical NLP …

System                      CNN Dev   CNN Test   Daily Mail Dev   Daily Mail Test
Frame-semantic model        36.3      40.2       35.5             35.5
Impatient Reader            61.8      63.8       69.0             68.0
Competent statistical NLP   67.1      67.9       69.1             68.3
MemNN window + self-sup     63.4      66.8       –                –
MemNN win, ss, ens, no-c    66.2      69.4       –                –
Ablating individual features
System II: End-to-End Neural Network [Chen, Bolton, & Manning, ACL 2016]
System II: End-to-End Neural Network

No magic at all; we make our model as simple as possible:
• Learned word embeddings feed into
• Bi-directional shallow LSTMs for the passage and the question
• The question representation is used for soft attention over the passage, with a simple bilinear attention function
• A final softmax layer predicts the answer entity
• SGD, dropout (0.2), batch size = 32, hidden size = 128, …
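A minimal NumPy sketch of the bilinear attention readout just described (shapes and names are assumptions: P holds the passage bi-LSTM states, q the question representation, W the learned bilinear matrix):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attentive_readout(P, q, W):
    """P: (T, h) passage states; q: (h,) question vector; W: (h, h)."""
    alpha = softmax(P @ W @ q)   # simple bilinear attention over positions
    return alpha @ P             # attended passage representation o

# The answer entity is then predicted by a final softmax, e.g. over
# scores E @ o for a hypothetical (num_entities x h) output matrix E.
```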
Competent new-fangled NLP …

System                       CNN Dev   CNN Test   DM Dev   DM Test
Impatient Reader             61.8      63.8       69.0     68.0
Competent statistical NLP    67.1      67.9       69.1     68.3
Our LSTM with attention      72.4      72.4       76.9     75.8
MemNN window + self-sup      63.4      66.8       –        –
MemNN win, ss, ensem, no-c   66.2      69.4       –        –
Differences:
• Simple bilinear attention [Luong, Pham, & Manning 2015]
• Hermann et al. had an extra, unnecessary layer joining o and q
• We predict among entities, not all words (but this doesn't make a difference)
• Maybe we're better at tuning neural nets? We've been doing it for a while.
Our Results

We are quite happy with the numbers
[and, BTW, several other people have now gotten similar numbers]
… but what do they really mean?
• What level of language understanding is needed?
• What have the models actually learned?
Data Analysis

A breakdown of the examples:
• Exact match
• Sentence-level paraphrasing / textual entailment
• Partial clue
• Multiple sentences
• Coreference errors
• Ambiguous or too hard
Data Analysis

• 25%: coreference errors + hard cases
• Only 2% require multiple sentences
Discussion

• The DeepMind RC data is quite noisy
• The required level of reasoning and inference is quite limited
• There isn't much room left for improvement
• However, the scale and ease of data production is appealing
• Can we make use of this data in solving more realistic RC tasks?
• Neural networks are great for learning semantic matches across lexical variation or paraphrasing!
• LSTMs with (simple bilinear) attention are great!
• It is not yet proven whether NNs can do more challenging RC tasks
AI2 4th Grade Science Question Answering [Angeli, Nayak, & Manning, ACL 2016]
Our “knowledge”:
Ovaries are the female part of the flower, which produces eggs that are needed for making seeds.

The question:
Which part of a plant produces the seeds?

The answer choices:
• the flower
• the leaves
• the stem
• the roots
How can we represent and reason with broad-coverage knowledge?

1. Rigid-schema knowledge bases with well-defined logical inference
2. Open-domain knowledge bases (OpenIE) – no clear ontology or inference [Etzioni et al. 2007ff]
3. Human language text KB – no rigid schema, but with “natural logic” we can do formal inference over human language text
Text as Knowledge Base

Storing knowledge as text is easy!
Doing inference over text might be hard:
• We don't want to run inference over every fact!
• We don't want to store all the inferences!
Inferences … on demand from a query … [Angeli and Manning 2014]
… using text as the meaning representation
Natural Logic: logical inference over text

We are doing logical inference:
The cat ate a mouse ⊨ ¬ No carnivores eat animals

We do it with natural logic: if I mutate a sentence in this way, do I preserve its truth?
Post-Deal Iran Asks if U.S. Is Still ‘Great Satan,’ or Something Less ⊨ A Country Asks if U.S. Is Still ‘Great Satan,’ or Something Less

• A sound and complete weak logic [Icard and Moss 2014]
• Expressive for common human inferences
• “Semantic” parsing is just syntactic parsing
• Tractable: polynomial-time entailment checking
• Plays nicely with lexical matching back-off methods
#1. Common-sense reasoning
Polarity in Natural Logic

We order phrases in partial orders (not just is-a-kind-of; we can also do geographical containment, etc.)

Polarity is the direction a phrase can move in this order

Example inferences:
• Quantifiers determine the polarity of phrases
• Valid mutations consider polarity

Successful toy inference:
All cats eat mice ⊨ All house cats consume rodents
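A toy sketch of how polarity licenses that inference (the two-entry lexicons are hypothetical stand-ins for the phrase orderings above): under “all”, the restrictor has downward polarity, so specializing it is truth-preserving, while the scope has upward polarity, so generalizing it is.

```python
# Toy polarity-checked mutation (illustrative lexicons, not the real
# phrase orderings).
MORE_SPECIFIC = {"cats": "house cats"}                 # move down the order
MORE_GENERAL = {"eat": "consume", "mice": "rodents"}   # move up the order

def valid_mutation(word, polarity):
    """Return a truth-preserving replacement for `word`, if any."""
    table = MORE_SPECIFIC if polarity == "down" else MORE_GENERAL
    return table.get(word)

# "All cats eat mice": the restrictor "cats" is downward, the scope upward.
assert valid_mutation("cats", "down") == "house cats"
assert valid_mutation("mice", "up") == "rodents"
```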
“Soft” Natural Logic

We also want to make likely (but not certain) inferences
• Same motivation as Markov logic, probabilistic soft logic, etc.
• Each mutation edge template i has a cost θᵢ ≥ 0
• The cost of an edge is θᵢ · fᵢ
• The cost of a path is θ · f (summing θᵢ · fᵢ over the edges along the path)
• We can learn the parameters θ
• Inference is then graph search
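Since inference is just search over weighted mutation edges, a uniform-cost search conveys the idea. This is an illustrative sketch, not the actual engine: `edges` and `theta` are assumed interfaces, with `edges(fact)` yielding (template id, fᵢ, mutated fact) triples.

```python
import heapq

def natural_logic_search(query, kb, edges, theta, budget=10.0):
    """Find the cheapest mutation path from `query` to a fact in `kb`."""
    frontier = [(0.0, query)]        # (path cost theta . f so far, fact)
    best = {}                        # cheapest cost at which we saw a fact
    while frontier:
        cost, fact = heapq.heappop(frontier)
        if fact in kb:
            return cost              # first KB hit is the cheapest proof
        if cost > budget or best.get(fact, float("inf")) <= cost:
            continue
        best[fact] = cost
        for template, f_i, mutated in edges(fact):
            heapq.heappush(frontier, (cost + theta[template] * f_i, mutated))
    return None                      # nothing derivable within the budget
```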
#2. Dealing with real, long sentences

Natural logic works with facts like these in the knowledge base:
Obama was born in Hawaii

But real-world sentences are complex:
Born in Honolulu, Hawaii, Obama is a graduate of Columbia University and Harvard Law School, where he served as president of the Harvard Law Review.

Approach:
1. A classifier yields entailed clauses from a long sentence
2. Shorten clauses with natural logic inference
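As a rough illustration of step 1 (the real system trains a classifier over dependency edges; the fixed relation list and `parse` format here are assumptions):

```python
# Hypothetical sketch: `parse` is a list of (head, relation, dependent)
# Universal Dependencies triples for the sentence.
CLAUSE_RELS = {"ccomp", "advcl", "acl:relcl", "conj", "appos"}

def candidate_clause_roots(parse):
    """Yield dependents whose subtrees may stand alone as clauses."""
    for head, rel, dep in parse:
        if rel in CLAUSE_RELS:
            yield dep  # the subtree under `dep`; a controlled subject may
                       # still have to be copied in before step 2 shortens it
```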
Universal Dependencies (UD) http://universaldependencies.github.io/docs/

A single level of typed dependency syntax that gives a simple, human-friendly representation of sentence structure and meaning

Better than a phrase-structure tree for machine interpretation – it's almost a semantic network

UD aims to be linguistically better across languages than earlier, common, simple NLP representations, such as CoNLL dependencies
Generation of minimal clauses

1. Classification problem: given a dependency edge, is it a clause?
2. Is it missing a controlled subject from the subject/object?
3. Shorten clauses while preserving validity!
   • All young rabbits drink milk ⊭ All rabbits drink milk
   • OK: SJC, the bay area's third largest airport, is experiencing delays due to weather.
   • Often better: SJC is experiencing delays.
Using natural logic
#3. Add a lexical alignment classifier

• Sometimes we can't quite make the inferences that we would like to make
• We use a simple lexical-match back-off classifier with features: matching words, mismatched words, unmatched words
• These always work pretty well – the lesson of RTE evaluations
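For instance, the back-off features might be computed along these lines (a sketch of the feature set in the bullet above, not the system's exact code):

```python
def lexical_features(premise, hypothesis):
    """Simple word-overlap features between two tokenized sentences."""
    p, h = set(premise), set(hypothesis)
    return {
        "matching": len(p & h),              # words shared by both sides
        "unmatched_hypothesis": len(h - p),  # hypothesis words with no match
        "unmatched_premise": len(p - h),     # premise words with no match
    }
```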
The full system

• We run our usual search over split-up, shortened clauses
• If we find a premise, great!
• If not, we use the lexical classifier as an evaluation function
• We work to do this quickly:
  • Visit 1M nodes/second; don't re-featurize, just compute deltas
  • 32-byte search states (thanks, Gabor!)
Solving 4th grade science (Allen AI datasets)

Multiple choice questions from real 4th grade science exams:

Which activity is an example of a good health habit?
(A) Watching television  (B) Smoking cigarettes  (C) Eating candy  (D) Exercising every day

In our corpus knowledge base:
• Plasma TVs can display up to 16 million colors … great for watching TV … also make a good screen.
• Not smoking or drinking alcohol is good for health, regardless of whether clothing is worn or not.
• Eating candy for dinner is an example of a poor health habit.
• Healthy is exercising
Solving 4th grade science (Allen AI NDMC)

System                                                  Dev   Test
KnowBot [Hixon et al. NAACL 2015]                       45    –
KnowBot (Oracle – human in loop)                        57    –
IR baseline (Lucene)                                    49    42
NaturalLI                                               52    51
More data + IR baseline                                 62    58
More data + NaturalLI                                   65    61
NaturalLI + evaluation function (lex. classifier)       74    67
Aristo [Clark et al. 2016], 6 systems, even more data   –     71

Test set: New York Regents 4th Grade Science exam multiple-choice questions from AI2. Training: “Basic” is Barron's study guide; “more data” is the SciText corpus from AI2. Score: % correct.
Envoi

Can our knowledge base just be text?

Natural logic provides a useful, formal (weak) logic for textual inference

Natural logic is easily combinable with lexical matching methods, including neural net methods

The resulting system is useful for:
• Common-sense reasoning
• Question Answering
• Also, Open Information Extraction