CS395T: Structured Models for NLP Lecture 22: Summariza
Transcript of CS395T: Structured Models for NLP Lecture 22: Summariza
Varia<onalAutoencoders
x
Input
q(z|x)
x
distribu<onoverz
MaximizeP(x|z,θ)
“inferencenetwork”
genera<vemodelx
z
Genera<vemodel(test): Autoencoder(training):
Miaoetal.(2015)
Eq(z|x)[logP (x|z, ✓)]�KL(q(z|x)kP (z))
TrainingVAEs
x
q(z|x)
x
“inferencenetwork”
genera<vemodel
Autoencoder(training):Foreachexamplex
Computeq(runforwardpasstocomputemuandsigma)
Samplez~q
Forsomenumberofsamples
ComputeP(x|z)andcomputeloss
Backpropagatetoupdatephi,theta
�
✓
Autoencoders
themoviewasgreat
the
<s>
movie was good [STOP]
‣ Inferencenetwork(q)istheencoderandgeneratoristhedecoder
+
Gaussiannoise
‣ Samecomputa<ongraphasVAEwithreparameteriza<on,addKLtermtomaketheobjec<vethesame
‣ Duringtraining,addGaussiannoiseandforcethenetworktopredict
ThisLecture‣ Extrac<vesystemsformul<-documentsummariza<on
‣ Extrac<ve+compressivesystemsforsingle-documentsummariza<on
‣ Single-documentsummariza<onwithneuralnetworks
Summariza<onBAGHDAD/ERBIL,Iraq(Reuters)-AstrongearthquakehitlargepartsofnorthernIraqandthecapitalBaghdadonSunday,andalsocauseddamageinvillagesacrosstheborderinIranwherestateTVsaidatleastsixpeoplehadbeenkilled.
Therewerenoimmediatereportsofcasual<esinIraqaierthequake,whoseepicenterwasinPenjwin,inSulaimaniyahprovincewhichisinthesemi-autonomousKurdistanregionveryclosetotheIranianborder,accordingtoanIraqimeteorologyofficial.
ButeightvillagesweredamagedinIranandatleastsixpeoplewerekilledandmanyothersinjuredinthebordertownofQasr-eShirininIran,IranianstateTVsaid.
TheUSGeologicalSurveysaidthequakemeasuredamagnitudeof7.3,whileanIraqimeteorologyofficialputitsmagnitudeat6.5accordingtopreliminaryinforma<on.
ManyresidentsintheIraqicapitalBaghdadrushedoutofhousesandtallbuildingsinpanic.…
Summariza<onIndianExpress—Amassiveearthquakeofmagnitude7.3struckIraqonSunday,103kms(64miles)southeastofthecityofAs-Sulaymaniyah,theUSGeologicalSurveysaid,reportsReuters.USGeologicalSurveyini<allysaidthequakewasofamagnitude7.2,beforerevisingitto7.3.
ThequakehasbeenfeltinseveralIranianci<esandeightvillageshavebeendamaged.Electricityhasalsobeendisruptedatmanyplaces,suggestfewTVreports.
Amassiveearthquakeofmagnitude7.3struckIraqonSunday.TheepicenterwasclosetotheIranianborder.EightvillagesweredamagedandsixpeoplewerekilledinIran.
Summary
Whatmakesagoodsummary?
Astrongearthquakeofmagnitude7.3struckIraqandIranonSunday.TheepicenterwasclosetotheIranianborder.EightvillagesweredamagedandsixpeoplewerekilledinIran.
Summary
‣ Contentselec<on:picktherightcontent‣ Rightcontentwasrepeatedwithinandacrossdocuments
‣ Domain-specific(magnitude+epicenterofearthquakesareimportant)
‣ Genera<on:writethesummary‣ Extrac<on:pickwholesentencesfromthesummary‣ Compression:compressthosesentencesbutbasicallyjustdodele<on‣ Abstrac<on:rewrite+reexpresscontentfreely
Extrac<veSummariza<on:MMR‣ Givensomear<clesandalengthbudgetofkwords,picksomesentencesoftotallength<=kandmakeasummary‣ Pickimportantyetdiversecontent:maximummarginalrelevance(MMR)
Whilesummaryis<kwordsCalculate
AddhighestMMRsentencethatdoesn’toverflowlength
“makethissentencesimilartoaquery”
“makethissentencemaximallydifferentfromallothersaddedsofar”
“maxoverallsentencesnotyetinthesummary”
CarbonellandGoldstein(1998)
Extrac<veSummariza<on:Centroid‣ Representthedocumentsandeachsentencesasbag-of-wordswithTF-IDFweigh<ng
Whilesummaryis<kwords
Calculatescore(sentence)=cosine(sent-vec,doc-vec)
Radevetal.(2004)
Discardallsentenceswhosesimilaritywithsomesentencealreadyinthesummaryistoohigh
Addthebestremainingsentencethatwon’toverflowthesummary
Extrac<veSummariza<on:BigramRecall‣ Countnumberofdocumentseachbigramoccursintomeasureimportance
GillickandFavre(2009)
score(massiveearthquake)=3score(Iraqicapital)=1score(sixkilled)=2score(magnitude7.3)=2
‣ ILPformula<on:candsareindicatorvariablesindexedoverconcepts(bigrams)andsentences,respec<vely
“setcito1iffsomesentencethatcontainsitisincluded”
sumofincludedsentences’lengthscan’texceedL
‣ Findsummarythatmaximizesthescoreofbigramsitcovers
Evalua<on:ROUGE‣ Rouge-n:n-gramrecallofsummaryw.r.t.goldstandard
Lin(2004)
‣ Rouge-2correlateswellwithhumanjudgmentsformul<-documentsummariza<ontasks
Results
Ghalandri(2017)
Bexercentroid:38.589.731.53
GillickandFavre/bigramrecall
‣ Caveat:thesetechniquesallworkbexerformul<-documentthansingle-document!
Mul<-Documentvs.SingleDocument
‣ “amassiveearthquakehitIraq”“amassiveearthquakestruckIraq”—lotsofredundancytohelpselectcontentinmul<-documentcase
‣Whenyouhavealotofdocuments,therearemorepossiblesentencestoextract:
ButeightvillagesweredamagedinIranandatleastsixpeoplewerekilledandmanyothersinjuredinthebordertownofQasr-eShirininIran,IranianstateTVsaid.
ThequakehasbeenfeltinseveralIranianci<esandeightvillageshavebeendamaged.
‣Mul<-documentsummariza<oniseasier?
CompressiveSummariza<onIndianExpress—Amassiveearthquakeofmagnitude7.3struckIraqonSunday,103kms(64miles)southeastofthecityofAs-Sulaymaniyah,theUSGeologicalSurveysaid,reportsReuters.USGeologicalSurveyini<allysaidthequakewasofamagnitude7.2,beforerevisingitto7.3.
‣ Sentenceextrac<onisn’taggressiveenoughatremovingirrelevantcontent
‣Wanttoextractsentencesandalsodeletecontentfromthem
Syntac<cCuts
‣ Deleteadjuncts
Amassiveearthquakeofmagnitude7.3struckIraqonSunday,103kms(64miles)…
NP
NP-LOCVBD NP NP-TMP
VP
S
‣ Usesyntac<crulestomakecertaindele<ons
Syntac<cCuts
‣ Deletesecondpartsofcoordina<onstructures
Atleastsixpeoplewerekilledandmanyothersinjured
S SCC
S
‣ Usesyntac<crulestomakecertaindele<ons
CompressiveILP‣ RecalltheGillick+FavreILP:
‣ Nowsjvariablesarenodesorsetsofnodesintheparsetree
Atleastsixpeoplewerekilledandmanyothersinjured
S SCC
S
s2s1
‣ Newconstraint:s2≤s1“s1isaprerequisitefors2”
Thishasn'tbeenKellogg'syear.
CompressiveSummariza<on
Itspresidentquitsuddenly.
Theoat-brancrazehascostitmarketshare.
AndnowKelloggiscancelingitsnewcerealplant,whichwouldhavecost$1billion.SBARNP
NP
Theoat-brancrazehascostKelloggmarketshare.
x1
x2
x3
x4
x5
max
x
�w>f(x)
�s.t. summary(x) obeys length limit
summary(x) is grammatical
summary(x) is coherent
ILP:
Constraints
max
x
�w>f(x)
�
Gramma<calityconstraints:allowcutswithinsentencesCoreferenceconstraints:donotallowpronounsthatwouldrefertonothing
‣ Otherwise,forceitsantecedenttobeincludedinthesummary‣ Ifwe’reconfidentaboutcoreference,rewritethepronoun(itKellogg)
s.t. summary(x) obeys length limit
summary(x) is grammatical
summary(x) is coherent
Durrexetal.(2016)
Features
max
x
�w>f(x)
�
AndnowKelloggiscancelingitsnewcerealplantf ( )=(SentenceIndex=4)
(FirstWord=And)
(NumContentWords=4)I
I
I
}s.t. summary(x) obeys length limit
summary(x) is grammatical
summary(x) is coherent
Documentposi<on:
Lexicalfeatures:
Centrality:‣ Nowusesafeature-basedmodel,wherefeaturesiden<fygoodcontent
Learning
max
x
�w>f(x)
�
‣ StructuredSVMwithROUGEaslossfunc<on
‣ TrainonalargecorpusofNewYorkTimesdocumentswithsummaries(100,000documents)
s.t. summary(x) obeys length limit
summary(x) is grammatical
summary(x) is coherent
‣ AugmenttheILPtokeeptrackofwhichbigramsareincludedornot,usetheseforloss-augmenteddecode
Berg-Kirkpatricketal.(2011),Durrexetal.(2016)
30
35
40
45
50
5 6 7 8 9 10
Results:NewYorkTimesCorpus
Linguis<cQuality(HumanstudyonMechanicalTurk)
Contentfidelity(ROUGE-1:wordoverlapwithreference) Firstnwords
Extractsentences
Extractclauses Fullsystem
Yoshidaetal.(2014)
Seq2seqSummariza<on
‣ Extrac<veparadigmisn’tallthatflexible,evenwithcompression
‣ Trainingishard!ILPsarehard!Maybejustuseseq2seq?
Itspresidentquitsuddenly
The
<s>
oat bran craze has
… …
Chopraetal.(2016)
‣ Traintoproducesummarybasedondocument
Seq2seqSummariza<on
Chopraetal.(2016)
‣ Task:generateheadlinefromfirstsentenceofar<cle(cangetlotsofdata!)
noaxen<onwithaxen<on
referencesentence
‣Worksprexywell,thoughthesemodelscangenerateincorrectsummaries…
‣Whathappensifwetrythisonalongerar<cle?
Seq2seqSummariza<on
Seeetal.(2017)
‣ Solu<ons:copymechanism,coverage,justlikeinMT…
‣ Thingsmights<llgowrong,nowayofpreven<ngthis…
NeuralSystems:Results
Seeetal.(2017)
‣ Abstrac<vesystemsdon’tbeata“lead”baselineonROUGE(lessn-gramoverlap)
‣ Copymechanismandcoveragehelpsubstan<ally
ChallengesofSummariza<on
‣ Trueabstrac<on?‣ Notreallynecessaryforar<cles‣ Genera<ngfromstructuredinforma<oncanusuallybedonewithtemplates…
summaryar<cle extrac<on
abstrac<on
compressionsyntax
seman<cs??? seman<cs???
syntax