Lecture 16: Dialogue (aritter.github.io/courses/5525_slides_v2/lec16-dialogue.pdf)

Post on 06-Feb-2020


Alan Ritter (many slides from Greg Durrett)

This Lecture

‣ Chatbot dialogue systems

‣ Task-oriented dialogue

‣ Other dialogue applications

Chatbots

Turing Test (1950)

‣ Imitation game: A and B are locked in rooms and answer C’s questions via typewriter. Both are trying to act like B.

[Figure: two diagrams, "Original Interpretation" and "Standard Interpretation", showing participants A, B, and C and a trained judge]

‣ The test is not “does this computer seem human-like to random people with a web browser?”

ELIZA

‣ Created 1964-1966 at MIT, heavily scripted

‣ DOCTOR script was most successful: repeats user’s input, asks inane questions

Weizenbaum (1966)

‣ Identify keyword, identify context, apply transformation rule

(.*) you (.*) me
Why do you think I $2 you?

My (.) (.*)
What else comes to mind when you think of your $1?

‣ Very little need to generate new content, but can only have one type of conversation
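The two transformation rules above can be sketched in Python as follows. This is a minimal illustration, not Weizenbaum's implementation: the word-boundary anchors, the use of `(\w+)` to capture a whole word after "my", the case-insensitive matching, and the fallback reply are all assumptions added so the example runs.

```python
import re

# Each rule pairs a pattern with a response template, tried in order.
# Template indices are 0-based: {0} is the first captured group.
RULES = [
    (re.compile(r"(.*)\byou\b(.*)\bme\b", re.IGNORECASE),
     "Why do you think I {1} you?"),
    # Assumption: capture the whole word after "my" rather than one character.
    (re.compile(r"\bmy (\w+)(.*)", re.IGNORECASE),
     "What else comes to mind when you think of your {0}?"),
]


def respond(utterance, fallback="Please go on."):
    """Return the response of the first matching rule, else a fallback."""
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            groups = [group.strip() for group in match.groups()]
            return template.format(*groups)
    return fallback  # hypothetical default reply, not from the slides


print(respond("I think you hate me"))
print(respond("My mother worries about everything"))
```

Because every reply comes from a fixed template, the system never has to generate new content, which is also why it can only sustain one type of conversation.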

Cleverbot

‣ Carpenter (1986), online system built in 2006

‣ “Nearest neighbors”: human says statement A, find a human response in human-human or human-computer chats to statement A, repeat that

User: Hi there cleverbot.
Cleverbot: How are you?
User: Doing okay. what plans do you have today?
Cleverbot: I do revision.
User: What are you revising for?
Cleverbot: Maths history.
User: How old are you?
Cleverbot: That's none of your business.
User: What is maths history?
Cleverbot: I don’t know, farming maybe

‣ Can often give sensible answers, but the bot doesn’t really impose high-level discourse structure
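The "nearest neighbors" idea can be sketched as a lookup over logged (statement, response) pairs. This is only an illustration under stated assumptions: the tiny chat log is made up from the example dialogue, and word-overlap (Jaccard) similarity stands in for whatever matching Cleverbot actually uses, which the slides do not specify.

```python
import string


def tokenize(text):
    """Lowercase, strip punctuation, and return the set of words."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return set(cleaned.split())


def jaccard(a, b):
    """Word-overlap similarity between two token sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def nearest_response(statement, chat_pairs):
    """Find the logged statement closest to the input and repeat its reply."""
    query = tokenize(statement)
    _, reply = max(chat_pairs, key=lambda pair: jaccard(query, tokenize(pair[0])))
    return reply


# Hypothetical chat log of (statement, response) pairs.
CHAT_LOG = [
    ("Hi there", "How are you?"),
    ("What are you revising for?", "Maths history."),
    ("How old are you?", "That's none of your business."),
]

print(nearest_response("how old are you", CHAT_LOG))
```

Each reply is chosen from the current statement alone, so nothing ties one turn to the next, which matches the observation that the bot imposes no high-level discourse structure.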

Data-Driven Approaches

‣ Can treat as a machine translation problem: “translate” from current utterance to next one

‣ Filter the data, use statistical measures to prune extracted phrases to get better performance

Ritter et al. (2011)

Seq2seq models

[Figure: encoder reads "What are you doing"; decoder starts from <s> and generates "I am going home [STOP]"]

‣ Just like conventional MT, can train seq2seq models for this task

‣ Why might this model perform poorly? What might it be bad at?

‣ Hard to evaluate:

Lack of Diversity

‣ Training to maximize likelihood gives a system that prefers common responses:

‣ Solution: mutual information criterion; response R should be predictive of user utterance U as well

‣ Standard conditional likelihood: $\log P(R \mid U)$

‣ Mutual information: $\log \frac{P(R, U)}{P(R)\,P(U)} = \log P(R \mid U) - \log P(R)$

‣ log P(R) can reflect probabilities under a language model

‣ OpenSubtitles data

Li et al. (2016)
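To see why the mutual information criterion discourages generic replies, the sketch below reranks two candidate responses. The candidate strings and all probabilities are invented for illustration, and the weight `lam` is an assumption of this sketch (with `lam=1.0` the score is exactly log P(R|U) - log P(R) as written above).

```python
import math


def mmi_score(log_p_r_given_u, log_p_r, lam=1.0):
    """Mutual-information score: log P(R|U) - lam * log P(R)."""
    return log_p_r_given_u - lam * log_p_r


# Hypothetical candidates mapped to (P(R|U), P(R)). The generic reply is
# likely given U but also very likely under the language model alone.
CANDIDATES = {
    "I don't know.": (0.20, 0.10),
    "I'm heading home after class.": (0.05, 0.001),
}


def best_by_likelihood(candidates):
    """Pick the response with the highest log P(R|U)."""
    return max(candidates, key=lambda r: math.log(candidates[r][0]))


def best_by_mmi(candidates, lam=1.0):
    """Pick the response with the highest mutual-information score."""
    return max(
        candidates,
        key=lambda r: mmi_score(
            math.log(candidates[r][0]), math.log(candidates[r][1]), lam
        ),
    )


print(best_by_likelihood(CANDIDATES))
print(best_by_mmi(CANDIDATES))
```

Under plain likelihood the generic reply wins; subtracting log P(R) penalizes it for being probable regardless of what the user said.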

Future of chatbots

‣ XiaoIce: Microsoft chatbot in Chinese, 20M users, average user interacts 60 times/month

‣ People do seem to like talking to them…?

‣ How deep can a conversation be without more semantic grounding? Basic facts aren’t even consistent…

‣ Can force chatbots to give consistent answers, but still probably not very interesting

Li et al. (2016) Persona…

Task-Oriented Dialogue

‣ Question answering/search:

User: Google, what’s the most valuable American company?
System: Apple
User: Who is its CEO?
System: Tim Cook

‣ Personal assistants / API front-ends:

User: Siri, find me a good sushi restaurant in Chelsea
System: Sushi Seki Chelsea is a sushi restaurant in Chelsea with 4.4 stars on Google
User: How expensive is it?
System: Entrees are around $30 each
User: Find me something cheaper

User: Hey Alexa, why isn’t my Amazon order here?
System: Let me retrieve your order. Your order was scheduled to arrive at 4pm today.
User: It never came
System: Okay, I can put you through to customer service.

Air Travel Information System (ATIS)

‣ Given an utterance, predict a domain-specific semantic interpretation

DARPA (early 1990s), Figure from Tur et al. (2010)

‣ Can formulate as semantic parsing, but simple slot-filling solutions (classifiers) work well too
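In the slot-filling formulation, a classifier assigns one tag per token, and the semantic interpretation is read off the tags. The sketch below shows only that last step, assuming BIO-style tags; the utterance and tag sequence are written by hand for illustration rather than taken from the ATIS data or produced by a trained model.

```python
def bio_to_slots(tokens, tags):
    """Collect slot values from per-token BIO tags.

    "B-x" starts slot x, "I-x" continues it, and "O" is outside any slot.
    """
    slots = {}
    current = None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            current = tag[2:]
            slots[current] = token
        elif tag.startswith("I-") and current == tag[2:]:
            slots[current] += " " + token
        else:
            current = None
    return slots


# Hand-written example in the style of ATIS annotations.
tokens = "show flights from boston to new york on monday".split()
tags = [
    "O", "O", "O", "B-fromloc.city_name", "O",
    "B-toloc.city_name", "I-toloc.city_name", "O", "B-depart_date.day_name",
]

print(bio_to_slots(tokens, tags))
```

Predicting the tag sequence is an ordinary per-token classification problem, which is why simple classifiers can work well without a full semantic parser.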

Full Dialogue Task

‣ Parsing / language understanding is just one piece of a system

‣ Dialogue state: reflects any information about the conversation (e.g., search history)

‣ User utterance -> update dialogue state -> take action (e.g., query the restaurant database) -> say something

Young et al. (2013)

User: Find me a good sushi restaurant in Chelsea

restaurant_type <- sushi
location <- Chelsea
curr_result <- execute_search()

System: Sushi Seki Chelsea is a sushi restaurant in Chelsea with 4.4 stars on Google

User: How expensive is it?

get_value(cost, curr_result)

System: Entrees are around $30 each
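The loop "user utterance -> update dialogue state -> take action -> say something" can be sketched end to end. Everything below is a toy under stated assumptions: the two-row database (the second row is invented), the keyword lookup that stands in for a real language-understanding model, and the response templates are all hypothetical, chosen only to reproduce the example exchange.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical database; the first row mirrors the example in the slides.
RESTAURANTS = [
    {"name": "Sushi Seki Chelsea", "restaurant_type": "sushi",
     "location": "Chelsea", "stars": 4.4, "cost": "around $30"},
    {"name": "Pizza Place", "restaurant_type": "pizza",
     "location": "Chelsea", "stars": 4.0, "cost": "around $15"},
]
KNOWN_TYPES = {"sushi", "pizza"}
KNOWN_LOCATIONS = {"chelsea"}


@dataclass
class DialogueState:
    slots: dict = field(default_factory=dict)
    curr_result: Optional[dict] = None


def update_state(state, utterance):
    """Keyword lookup standing in for a parser or classifier."""
    for word in utterance.lower().replace("?", "").split():
        if word in KNOWN_TYPES:
            state.slots["restaurant_type"] = word
        elif word in KNOWN_LOCATIONS:
            state.slots["location"] = word.capitalize()


def execute_search(state):
    """Return the first row that matches every filled slot."""
    for row in RESTAURANTS:
        if all(row[key] == value for key, value in state.slots.items()):
            return row
    return None


def respond(state, utterance):
    """One turn: update state, take an action, then generate from a template."""
    update_state(state, utterance)
    if "expensive" in utterance.lower() and state.curr_result:
        # Corresponds to get_value(cost, curr_result).
        return f"Entrees are {state.curr_result['cost']} each"
    state.curr_result = execute_search(state)
    if state.curr_result is None:
        return "Sorry, I could not find a match."
    r = state.curr_result
    return (f"{r['name']} is a {r['restaurant_type']} restaurant in "
            f"{r['location']} with {r['stars']} stars")


state = DialogueState()
print(respond(state, "Find me a good sushi restaurant in Chelsea"))
print(respond(state, "How expensive is it?"))
```

The second turn only works because `curr_result` persists in the state: "it" is never resolved from the utterance itself.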

POMDP-based Dialogue Systems

‣ POMDP: user is the “environment,” an utterance is a noisy signal of state

‣ Dialogue model: can look like a parser or any kind of encoder model

‣ Generator: use templates or seq2seq model

‣ Where do rewards come from?

Young et al. (2013)
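One way to read "an utterance is a noisy signal of state" is as a Bayesian belief update: the system keeps a distribution over what the user might want and reweights it by how likely the observed utterance is under each hypothesis. The sketch below assumes the user's goal does not change between turns (so there is no transition model), and the goal names and likelihood values are invented; it is not the model from Young et al. (2013).

```python
def update_belief(belief, likelihood):
    """Bayes rule: new_belief(s) is proportional to P(obs | s) * belief(s)."""
    unnormalized = {s: belief[s] * likelihood.get(s, 0.0) for s in belief}
    total = sum(unnormalized.values())
    if total == 0.0:
        # Observation impossible under every hypothesis: keep the old belief.
        return dict(belief)
    return {s: p / total for s, p in unnormalized.items()}


# Uniform prior over two hypothetical user goals.
belief = {"sushi": 0.5, "pizza": 0.5}

# Hypothetical likelihoods of a noisy utterance under each goal.
heard = {"sushi": 0.8, "pizza": 0.2}

belief = update_belief(belief, heard)
print(belief)

# Hearing the same noisy evidence again sharpens the belief further.
belief = update_belief(belief, heard)
print(belief)
```

Acting on a belief rather than a single decoded state lets the system ask for confirmation when no hypothesis dominates.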

Reward for completing task?

User: Find me a good sushi restaurant in Chelsea

restaurant_type <- sushi
location <- Chelsea
curr_result <- execute_search()

System: Sushi Seki Chelsea is a sushi restaurant in Chelsea with 4.4 stars on Google

User: How expensive is it?

User: Okay make me a reservation!

make_reservation(curr_result)

+1

‣ The final +1 is a very indirect signal of what should happen in the earlier steps
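A small calculation shows how little a single end-of-task reward says about earlier steps. The sketch assumes the standard reinforcement-learning notion of a discounted return; the discount factor of 0.9 and the four-step breakdown of the dialogue are assumptions, not something the slides specify.

```python
def discounted_returns(rewards, gamma=0.9):
    """Return-to-go at each step: G_t = r_t + gamma * G_{t+1}."""
    returns = []
    running = 0.0
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


# Hypothetical episode: update state, search, respond, make reservation.
# Only the final step (the completed reservation) earns +1.
rewards = [0.0, 0.0, 0.0, 1.0]

print(discounted_returns(rewards))
```

Every earlier step receives the same kind of credit whether or not it was the right thing to do, so the learner cannot tell from this signal alone which state update or search actually mattered.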

User gives reward?

User: Find me a good sushi restaurant in Chelsea

restaurant_type <- sushi
location <- Chelsea
curr_result <- execute_search()

System: Sushi Seki Chelsea is a sushi restaurant in Chelsea with 4.4 stars on Google (+1)

User: How expensive is it?

get_value(cost, curr_result)

System: Entrees are around $30 each (+1)

‣ How does the user know the right search happened?

Wizard-of-Oz

‣ Learning from demonstrations: “wizard” pulls the levers and makes the dialogue system update its state and take actions

Kelley (early 1980s), Ford and Smith (1982)

Full Dialogue Task

User: Find me a good sushi restaurant in Chelsea

restaurant_type <- sushi
location <- Chelsea
curr_result <- execute_search()
(wizard enters these)

System: Sushi Seki Chelsea is a sushi restaurant in Chelsea with 4.4 stars on Google
(wizard types this out or invokes templates)

‣ Wizard can be a trained expert and know exactly what the dialogue system is supposed to do

Learning from Static Traces

‣ Using either wizard-of-Oz or other annotations, can collect static traces and train from these

Bordes et al. (2017)

Full Dialogue Task

User: Find me a good sushi restaurant in Chelsea

restaurant_type <- sushi
location <- Chelsea
stars <- 4+
curr_result <- execute_search()

‣ User asked for a “good” restaurant: does that mean we should filter by star rating? What does “good” mean?

‣ Hard to change system behavior if training from static traces, especially if system capabilities or desired behavior change

Goal-oriented Dialogue

‣ Big Companies: Apple Siri (VocalIQ), Google Allo, Amazon Alexa, Microsoft Cortana, Facebook M, Samsung Bixby, Tencent WeChat

‣ Startups:

‣ Lots of cool work that’s not public yet

‣ Tons of industry interest!

Other Dialogue Applications

Search/QA as Dialogue

‣ “Has Chris Pratt won an Oscar?” / “Has he won an Oscar”

QA as Dialogue

‣ Dialogue is a very natural way to find information from a search engine or a QA system

‣ Challenges:

‣ QA is hard enough on its own

‣ Users move the goalposts

Iyyer et al. (2017)

‣ UW QuAC dataset: Question Answering in Context

Choi et al. (2018)

Search as Dialogue

‣ Google can deal with misspellings, so more misspellings happen; Google has to do more!

Dialogue Mission Creep

[Figure: two development loops, labeled "Most NLP tasks" and "Dialogue/Search/QA", each cycling through Data, System, Error analysis, and Better model; the figure also includes a "Harder Data" node]

‣ Fixed distribution (e.g., natural language sentences), error rate -> 0

‣ Error rate -> ???; “mission creep” from HCI element

‣ High visibility: your product has to work really well!

Takeaways

‣ Some decent chatbots, applications: predictive text input, …

‣ Task-oriented dialogue systems are growing in scope and complexity

‣ More and more problems are being formulated as dialogue: interesting applications but challenging to get working well