Texts as Knowledge Bases
Christopher Manning, joint work with Gabor Angeli and Danqi Chen
Stanford NLP Group · @chrmanning · @stanfordnlp
AKBC 2016
Machine Comprehension = Machine has an Augmented Knowledge Base

“A machine comprehends a passage of text if, for any question regarding that text that can be answered correctly by a majority of native speakers, that machine can provide a string which those speakers would agree both answers that question, and does not contain information irrelevant to that question.” – Chris Burges
How far do current deep learning reading comprehension systems go in achieving Chris Burges's goal?
Two case studies … previews of ACL 2016

How can we use natural logic and shallow reasoning to better treat texts as a knowledge base?

DeepMind RC dataset [Hermann et al. 2015]
DeepMind RC dataset

• Large dataset
• Real language
• Good for DL training!
• “Artificial” pre-processing (coref, anonymization)
• How hard is it?
• Is it a good task?
Results on DeepMind RC when we began [Hermann et al. 2015; Hill et al. 2016]

System                     CNN Dev   CNN Test   Daily Mail Dev   Daily Mail Test
Frame-semantic model       36.3      40.2       35.5             35.5
Word distance model        50.5      50.9       56.4             55.5
Deep LSTM Reader           55.0      57.0       63.3             62.2
Attentive Reader           61.6      63.0       70.5             69.0
Impatient Reader           61.8      63.8       69.0             68.0
MemNN window memory        58.0      60.6       –                –
MemNN window + self-sup    63.4      66.8       –                –
MemNN win, ss, ens, no-c   66.2      69.4       –                –
Frame semantics or simple syntax?

Frame-semantic parsing attempts to identify predicates and their semantic arguments – it should be good for question answering!

Hermann et al. use a “state-of-the-art frame-semantic parser” – the Google version of [Das et al. 2013; Hermann et al. 2014]

But frame-semantic systems have coverage problems: they fail to represent pertinent relations that are not mapped onto verbal frames

How about a good old feature-based system, using a syntactic dependency parser?
System I: Standard Entity-Centric Classifier [Chen, Bolton, & Manning, ACL 2016]

• Build a symbolic feature vector for each entity
• The goal is to learn feature weights such that the correct answer ranks higher than the other entities
• Train logistic regression and a MART classifier (boosted decision trees – these do better and are reported)
• Whether e is in the passage
• Whether e is in the question
• Frequency of e in the passage
• First position of e in the passage
• n-gram exact match (features for matching 1/2 words to the left/right)
• Word distance of question words in the passage
• Whether e co-occurs with the question verb or another entity
• Syntactic dependency parse triple match around e
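To make the recipe concrete, here is a rough sketch of how such a feature vector might be assembled (illustrative only, not the paper's implementation; the function and its inputs are hypothetical):

```python
# Sketch of the entity-centric feature vector (not the authors' code).
# `passage` and `question` are pre-tokenized word lists; `entity` is an
# anonymized marker such as "@entity3".
def entity_features(entity, passage, question):
    positions = [i for i, w in enumerate(passage) if w == entity]
    return {
        "in_passage": bool(positions),
        "in_question": entity in question,
        "frequency": len(positions),
        "first_position": positions[0] if positions else -1,
        # The remaining features (n-gram exact match around the question
        # placeholder, word distance, co-occurrence with the question verb,
        # dependency-triple match) additionally need the placeholder
        # position and a dependency parse, omitted in this sketch.
    }
```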
Competent (traditional) statistical NLP …

System                      CNN Dev   CNN Test   Daily Mail Dev   Daily Mail Test
Frame-semantic model        36.3      40.2       35.5             35.5
Impatient Reader            61.8      63.8       69.0             68.0
Competent statistical NLP   67.1      67.9       69.1             68.3
MemNN window + self-sup     63.4      66.8       –                –
MemNN win, ss, ens, no-c    66.2      69.4       –                –
Ablating individual features
System II: End-to-End Neural Network [Chen, Bolton, & Manning, ACL 2016]
System II: End-to-End Neural Network

No magic at all; we make our model as simple as possible:
• Learned word embeddings feed into
• Bi-directional shallow LSTMs for the passage and the question
• The question representation is used for soft attention over the passage, with a simple bilinear attention function
• A final softmax layer predicts the answer entity
• SGD, dropout (0.2), batch size = 32, hidden size = 128, …
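A minimal NumPy sketch of the bilinear attention readout just described (shapes and names are assumptions: P holds the passage bi-LSTM states, q the question representation, W the learned bilinear matrix):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attentive_readout(P, q, W):
    """P: (T, h) passage states; q: (h,) question vector; W: (h, h)."""
    alpha = softmax(P @ W @ q)   # simple bilinear attention over positions
    return alpha @ P             # attended passage representation o

# The answer entity is then predicted by a final softmax, e.g. over
# scores E @ o for a hypothetical (num_entities x h) output matrix E.
```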
Competent new-fangled NLP …

System                       CNN Dev   CNN Test   DM Dev   DM Test
Impatient Reader             61.8      63.8       69.0     68.0
Competent statistical NLP    67.1      67.9       69.1     68.3
Our LSTM with attention      72.4      72.4       76.9     75.8
MemNN window + self-sup      63.4      66.8       –        –
MemNN win, ss, ensem, no-c   66.2      69.4       –        –
Differences:
• Simple bilinear attention [Luong, Pham, & Manning 2015]
• Hermann et al. had an extra, unnecessary layer joining o and q
• We predict among entities, not all words (but this doesn't make a difference)
• Maybe we're better at tuning neural nets? We've been doing it for a while.
Our Results

We are quite happy with the numbers
[and, BTW, several other people have now gotten similar numbers]
… but what do they really mean?
• What level of language understanding is needed?
• What have the models actually learned?
Data Analysis

A breakdown of the examples:
• Exact match
• Sentence-level paraphrasing / textual entailment
• Partial clue
• Multiple sentences
• Coreference errors
• Ambiguous or too hard
Data Analysis

• 25%: coreference errors + hard cases
• Only 2% require multiple sentences
Discussion

• The DeepMind RC data is quite noisy
• The required level of reasoning and inference is quite limited
• There isn't much room left for improvement
• However, the scale and ease of data production is appealing
• Can we make use of this data in solving more realistic RC tasks?
• Neural networks are great for learning semantic matches across lexical variation or paraphrasing!
• LSTMs with (simple bilinear) attention are great!
• It is not yet proven whether NNs can do more challenging RC tasks
AI2 4th Grade Science Question Answering [Angeli, Nayak, & Manning, ACL 2016]
Our “knowledge”:
Ovaries are the female part of the flower, which produces eggs that are needed for making seeds.

The question:
Which part of a plant produces the seeds?

The answer choices:
• the flower
• the leaves
• the stem
• the roots
How can we represent and reason with broad-coverage knowledge?

1. Rigid-schema knowledge bases with well-defined logical inference
2. Open-domain knowledge bases (OpenIE) – no clear ontology or inference [Etzioni et al. 2007ff]
3. Human language text KB – no rigid schema, but with “natural logic” we can do formal inference over human language text
Text as Knowledge Base

Storing knowledge as text is easy!
Doing inference over text might be hard:
• We don't want to run inference over every fact!
• We don't want to store all the inferences!
Inferences … on demand from a query … [Angeli and Manning 2014]
… using text as the meaning representation
Natural Logic: logical inference over text

We are doing logical inference:
The cat ate a mouse ⊨ ¬ No carnivores eat animals

We do it with natural logic: if I mutate a sentence in this way, do I preserve its truth?
Post-Deal Iran Asks if U.S. Is Still ‘Great Satan,’ or Something Less ⊨ A Country Asks if U.S. Is Still ‘Great Satan,’ or Something Less

• A sound and complete weak logic [Icard and Moss 2014]
• Expressive for common human inferences
• “Semantic” parsing is just syntactic parsing
• Tractable: polynomial-time entailment checking
• Plays nicely with lexical matching back-off methods
#1. Common-sense reasoning
Polarity in Natural Logic

We order phrases in partial orders (not just is-a-kind-of; we can also do geographical containment, etc.)

Polarity is the direction a phrase can move in this order

Example inferences:
• Quantifiers determine the polarity of phrases
• Valid mutations consider polarity

Successful toy inference:
All cats eat mice ⊨ All house cats consume rodents
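A toy sketch of how polarity licenses that inference (the two-entry lexicons are hypothetical stand-ins for the phrase orderings above): under “all”, the restrictor has downward polarity, so specializing it is truth-preserving, while the scope has upward polarity, so generalizing it is.

```python
# Toy polarity-checked mutation (illustrative lexicons, not the real
# phrase orderings).
MORE_SPECIFIC = {"cats": "house cats"}                 # move down the order
MORE_GENERAL = {"eat": "consume", "mice": "rodents"}   # move up the order

def valid_mutation(word, polarity):
    """Return a truth-preserving replacement for `word`, if any."""
    table = MORE_SPECIFIC if polarity == "down" else MORE_GENERAL
    return table.get(word)

# "All cats eat mice": the restrictor "cats" is downward, the scope upward.
assert valid_mutation("cats", "down") == "house cats"
assert valid_mutation("mice", "up") == "rodents"
```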
“Soft” Natural Logic

We also want to make likely (but not certain) inferences
• Same motivation as Markov logic, probabilistic soft logic, etc.
• Each mutation edge template i has a cost θᵢ ≥ 0
• The cost of an edge is θᵢ · fᵢ
• The cost of a path is θ · f (summing θᵢ · fᵢ over the edges along the path)
• We can learn the parameters θ
• Inference is then graph search
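Since inference is just search over weighted mutation edges, a uniform-cost search conveys the idea. This is an illustrative sketch, not the actual engine: `edges` and `theta` are assumed interfaces, with `edges(fact)` yielding (template id, fᵢ, mutated fact) triples.

```python
import heapq

def natural_logic_search(query, kb, edges, theta, budget=10.0):
    """Find the cheapest mutation path from `query` to a fact in `kb`."""
    frontier = [(0.0, query)]        # (path cost theta . f so far, fact)
    best = {}                        # cheapest cost at which we saw a fact
    while frontier:
        cost, fact = heapq.heappop(frontier)
        if fact in kb:
            return cost              # first KB hit is the cheapest proof
        if cost > budget or best.get(fact, float("inf")) <= cost:
            continue
        best[fact] = cost
        for template, f_i, mutated in edges(fact):
            heapq.heappush(frontier, (cost + theta[template] * f_i, mutated))
    return None                      # nothing derivable within the budget
```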
#2. Dealing with real, long sentences

Natural logic works with facts like these in the knowledge base:
Obama was born in Hawaii

But real-world sentences are complex:
Born in Honolulu, Hawaii, Obama is a graduate of Columbia University and Harvard Law School, where he served as president of the Harvard Law Review.

Approach:
1. A classifier yields entailed clauses from a long sentence
2. Shorten clauses with natural logic inference
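As a rough illustration of step 1 (the real system trains a classifier over dependency edges; the fixed relation list and `parse` format here are assumptions):

```python
# Hypothetical sketch: `parse` is a list of (head, relation, dependent)
# Universal Dependencies triples for the sentence.
CLAUSE_RELS = {"ccomp", "advcl", "acl:relcl", "conj", "appos"}

def candidate_clause_roots(parse):
    """Yield dependents whose subtrees may stand alone as clauses."""
    for head, rel, dep in parse:
        if rel in CLAUSE_RELS:
            yield dep  # the subtree under `dep`; a controlled subject may
                       # still have to be copied in before step 2 shortens it
```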
Universal Dependencies (UD) http://universaldependencies.github.io/docs/

A single level of typed dependency syntax that gives a simple, human-friendly representation of sentence structure and meaning

Better than a phrase-structure tree for machine interpretation – it's almost a semantic network

UD aims to be linguistically better across languages than earlier, common, simple NLP representations, such as CoNLL dependencies
Generation of minimal clauses

1. Classification problem: given a dependency edge, is it a clause?
2. Is it missing a controlled subject from the subject/object?
3. Shorten clauses while preserving validity!
   • All young rabbits drink milk ⊭ All rabbits drink milk
   • OK: SJC, the bay area's third largest airport, is experiencing delays due to weather.
   • Often better: SJC is experiencing delays.
Using natural logic
#3. Add a lexical alignment classifier

• Sometimes we can't quite make the inferences that we would like to make
• We use a simple lexical-match back-off classifier with features: matching words, mismatched words, unmatched words
• These always work pretty well – the lesson of RTE evaluations
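For instance, the back-off features might be computed along these lines (a sketch of the feature set in the bullet above, not the system's exact code):

```python
def lexical_features(premise, hypothesis):
    """Simple word-overlap features between two tokenized sentences."""
    p, h = set(premise), set(hypothesis)
    return {
        "matching": len(p & h),              # words shared by both sides
        "unmatched_hypothesis": len(h - p),  # hypothesis words with no match
        "unmatched_premise": len(p - h),     # premise words with no match
    }
```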
The full system

• We run our usual search over split-up, shortened clauses
• If we find a premise, great!
• If not, we use the lexical classifier as an evaluation function
• We work to do this quickly:
  • Visit 1M nodes/second; don't re-featurize, just compute deltas
  • 32-byte search states (thanks, Gabor!)
Solving 4th grade science (Allen AI datasets)

Multiple choice questions from real 4th grade science exams:

Which activity is an example of a good health habit?
(A) Watching television  (B) Smoking cigarettes  (C) Eating candy  (D) Exercising every day

In our corpus knowledge base:
• Plasma TVs can display up to 16 million colors … great for watching TV … also make a good screen.
• Not smoking or drinking alcohol is good for health, regardless of whether clothing is worn or not.
• Eating candy for dinner is an example of a poor health habit.
• Healthy is exercising
Solving 4th grade science (Allen AI NDMC)

System                                                  Dev   Test
KnowBot [Hixon et al. NAACL 2015]                       45    –
KnowBot (Oracle – human in loop)                        57    –
IR baseline (Lucene)                                    49    42
NaturalLI                                               52    51
More data + IR baseline                                 62    58
More data + NaturalLI                                   65    61
NaturalLI + evaluation function (lex. classifier)       74    67
Aristo [Clark et al. 2016], 6 systems, even more data   –     71

Test set: New York Regents 4th Grade Science exam multiple-choice questions from AI2. Training: “Basic” is Barron's study guide; “more data” is the SciText corpus from AI2. Score: % correct.
Envoi

Can our knowledge base just be text?

Natural logic provides a useful, formal (weak) logic for textual inference

Natural logic is easily combinable with lexical matching methods, including neural net methods

The resulting system is useful for:
• Common-sense reasoning
• Question Answering
• Also, Open Information Extraction