Variational Autoencoders Write Poetry
llcao.net/cu-deeplearning17/pp/class7_Elsbeth_Fei-Tzin.pdf

Page 1:

Variational Autoencoders Write Poetry

(Generating Sentences from a Continuous Space)

Elsbeth Turcan and Fei-Tzin Lee

Paper by Sam Bowman, Luke Vilnis et al.

2016

Page 2:

Motivation

– Generative models for natural language sentences

– Machine translation

– Image captioning

– Dataset summarization

– Chatbots

– Etc.

– Want to capture high-level features of text, such as topic and style, and keep them consistent when generating text

Page 3:

Related work - RNNLM

– In the words of Bowman et al., “A standard RNN language model predicts each word of a sentence conditioned on the previous word and an evolving hidden state.”

– In other words, it only looks at the relationships between consecutive words, and so does not contain or observe any global features

– But what if we want global information?

Page 4:

Other related work

– Skip-thought

– Generate sentence codes in the style of word embeddings to predict context sentences

– Paragraph vector

– A vector representing the paragraph is incorporated into single-word embeddings

Page 5:

Autoencoders

– Typically composed of two RNNs

– The first RNN encodes a sentence into an intermediate vector

– The second RNN decodes the intermediate representation back into a sentence, ideally the same as the input
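A minimal sketch of such a two-RNN autoencoder in PyTorch (illustrative only; the GRU cells, layer sizes, and teacher forcing are assumptions, not the paper's configuration):

import torch
import torch.nn as nn

class Seq2SeqAutoencoder(nn.Module):
    """Encoder RNN -> single intermediate vector -> decoder RNN."""
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, tokens):
        emb = self.embed(tokens)           # (batch, seq_len, emb_dim)
        _, h = self.encoder(emb)           # h is the intermediate sentence vector
        dec_out, _ = self.decoder(emb, h)  # reconstruct with teacher forcing
        return self.out(dec_out)           # per-step logits over the vocabulary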

Page 6:

Variational Autoencoders (VAEs)

– Regular autoencoders learn only discrete mappings from point to point

– However, if we want to learn holistic information about the structure of sentences, we need to be able to fill sentence space better

– In a VAE, we replace the hidden vector z with a posterior probability distribution q(z|x) conditioned on the input, and sample our latent z from that distribution at each step

– We ensure that this distribution has a tractable form by enforcing its similarity to a defined prior distribution, typically some form of Gaussian
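Concretely (notation reconstructed here from the standard VAE setup, not from anything explicit on the slide): the posterior is usually taken to be a diagonal Gaussian q(z|x) = N(z; µ(x), diag(σ²(x))) whose parameters are produced by the encoder, and the prior is p(z) = N(0, I).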

Page 7:

Modified loss function

– The regular autoencoder's loss function would encourage the VAE to learn posteriors that are as close to discrete points as possible – in other words, Gaussians clustered extremely tightly around their means

– In order to enforce our posterior's similarity to a well-formed Gaussian, we introduce a KL divergence term into our loss, as below:
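The equation itself did not survive into this transcript; the variational lower bound used in the paper has the standard form

L(θ; x) = −KL(q_θ(z|x) || p(z)) + E_{z∼q_θ(z|x)}[log p_θ(x|z)] ≤ log p(x)

and, for a diagonal-Gaussian posterior with a standard-normal prior, the KL term has the closed form −½ Σ_j (1 + log σ_j² − µ_j² − σ_j²).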

Page 8:

Reparameterization trick

– In the original formulation, the encoder net encodes the sentence into a probability distribution (usually Gaussian); practically speaking, it encodes the sentence into the parameters of the distribution (i.e. µ and σ)

– However, this poses challenges for us while backpropagating: we can't backpropagate over the jump from µ and σ to z, since it's random

– Solution: extract the randomness from the Gaussian by reformulating it as a function of µ, σ, and another separate random variable

(Figure from Stack Overflow.)
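A minimal PyTorch sketch of the trick (the names mu and logvar are illustrative; predicting the log-variance rather than σ directly is a common convention, not something stated on the slide):

import torch

def reparameterize(mu, logvar):
    """Sample z = mu + sigma * eps with eps ~ N(0, I); gradients flow
    through mu and sigma, while all the randomness lives in eps."""
    sigma = torch.exp(0.5 * logvar)   # recover sigma from the predicted log-variance
    eps = torch.randn_like(sigma)     # separate random variable, needs no gradient
    return mu + sigma * eps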

Page 9:

Specific architecture

– Single-layer LSTM for encoder and decoder
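A rough sketch of how these pieces might be wired together (single-layer LSTM encoder and decoder with the latent code initializing the decoder state; the dimensions, the tanh projection, and the use of teacher forcing are assumptions, not the paper's exact hyperparameters):

import torch
import torch.nn as nn

class SentenceVAE(nn.Module):
    def __init__(self, vocab_size, emb_dim=256, hid_dim=512, z_dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hid_dim, num_layers=1, batch_first=True)
        self.to_mu = nn.Linear(hid_dim, z_dim)
        self.to_logvar = nn.Linear(hid_dim, z_dim)
        self.z_to_h = nn.Linear(z_dim, hid_dim)   # latent code seeds the decoder state
        self.decoder = nn.LSTM(emb_dim, hid_dim, num_layers=1, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, tokens):
        emb = self.embed(tokens)
        _, (h, _) = self.encoder(emb)                               # final encoder hidden state
        mu, logvar = self.to_mu(h[-1]), self.to_logvar(h[-1])       # posterior parameters
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)     # reparameterized sample
        h0 = torch.tanh(self.z_to_h(z)).unsqueeze(0)                # (1, batch, hid_dim)
        dec_out, _ = self.decoder(emb, (h0, torch.zeros_like(h0)))  # teacher-forced decoding
        return self.out(dec_out), mu, logvar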

Page 10:

Issues and fixes

– Decoder too strong: without any limitations, it just doesn't use z at all

– Fix: KL annealing

– Fix: word dropout
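Sketches of both fixes (the linear annealing schedule and the keep-rate handling here are illustrative choices; the paper's exact schedules differ):

import torch

def kl_weight(step, warmup_steps=10000):
    """KL annealing: slowly ramp the weight on the KL term from 0 to 1,
    so the model can first learn to use z before the KL penalty bites."""
    return min(1.0, step / warmup_steps)

def word_dropout(tokens, keep_rate, unk_id):
    """Randomly replace the decoder's input words with <unk> so it cannot
    rely solely on the previous ground-truth word and must consult z."""
    keep = torch.rand(tokens.shape) < keep_rate
    return torch.where(keep, tokens, torch.full_like(tokens, unk_id))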

Page 11:

Experiments – Language modeling

– Used the VAE to create language models on the Penn Treebank dataset, with an RNNLM as the baseline

– Task: train a language model on the training set and have it assign high probability to the test set

– The RNNLM outperformed the VAE in the traditional setting

– However, when handicaps were imposed on both models (an inputless decoder), the VAE was significantly better able to overcome them

Page 12:

Experiments – Imputing missing words

– Task: infer missing words in a sentence given some known words (imputation)

– Place the unknown words at the end of the sentence for the RNNLM

– The RNNLM and the VAE performed beam search (VAE decoding broken into three steps) to produce the most likely words to complete a sentence

– Precise evaluation of these results is computationally difficult

Page 13:

Adversarial evaluation

– Instead, create an adversarial classifier, trained to distinguish real sentences from generated sentences, and score the model on how well it fools the adversary

– Adversarial error is defined as the gap between chance accuracy (50%) and the real accuracy of the adversary – ideally, this error will be minimized
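In symbols (a direct reading of the slide's definition; the paper's exact scaling may differ): adversarial error = accuracy_of_adversary − 0.5, which is zero when generated sentences are indistinguishable from real ones.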

Page 14:

Experiments - Other

– Several other experiments in the appendix showed the VAE to be applicable to a variety of tasks

– Text classification

– Paraphrase detection

– Question classification

Page 15:

Analysis

– Word dropout

– Keep rate too low: sentence structure suffers

– Keep rate too high: no creativity; stifles the variation

– Effects on cost function components (plot on the original slide):

Page 16:

Extras: sampling from the posterior and homotopies

– Sampling from the posterior: examples of sentences adjacent in sentence space

– Homotopies: linear interpolations in sentence space between the codes for two sentences
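A small sketch of a homotopy between two latent codes (encode_fn and decode_fn are hypothetical stand-ins for the model's encoder and decoder, not functions from the paper):

import torch

def homotopy(z1, z2, steps=5):
    """Linear interpolation in latent space: z(t) = (1 - t) * z1 + t * z2."""
    return [(1 - t) * z1 + t * z2 for t in torch.linspace(0, 1, steps)]

# Illustrative usage: decode each interpolated code back into a sentence.
# sentences = [decode_fn(z) for z in homotopy(encode_fn(s1), encode_fn(s2))]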

Page 17:

Even more homotopies

Page 18:

Thanks for listening!

– Any questions?