Post on 14-Mar-2022
Language Model as an Annotator: Exploring DialoGPT for Dialogue Summarization
Xiachong Feng1, Xiaocheng Feng1,2, Libo Qin1, Bing Qin1,2, Ting Liu1,2
1 Harbin Institute of Technology, 2 Peng Cheng Laboratory
Dialogue Summarization
• Dialogue summarization aims to generate a succinct summary while retaining the essential information of the dialogue.
Dialogue (Blair and Chuck):
- Remember we are seeing the wedding planner after work
- Sure, where are we meeting her?
- At Nonna Rita's
- I want to order seafood tagliatelle
- Haha why not
- We remmber spaghetti pomodoro disaster from our last meeting
- Omg it was over her white blouse :D :P

Summary:
Blair and Chuck are going to meet the wedding planner after work at Nonna Rita's. The tagliatelle served at Nonna Rita's are very good.
A Good Summary?
Peyrard (2019): a good summary is intuitively related to three aspects: informativeness, redundancy, and relevance.
Related Works
• For informativeness
  • Linguistically specific words
  • Domain terminologies
  • Topic words
• For redundancy
  • Similarity-based methods to annotate redundant utterances
• For relevance
  • Topic segmentation
Problems
• Relied on human annotations
  • labor-consuming
• Obtained via open-domain toolkits
  • dialogue-agnostic, not suitable for dialogues
Overview
• Keywords Extraction: extracts unpredictable words as keywords.
• Topic Segmentation: inserts a topic segmentation point before an utterance if it is unpredictable.
• Redundancy Detection: detects utterances that are useless for context representation as redundant.
Keywords Extraction: DialoGPT_KE
• Motivation: if a word in the golden response is difficult for DialoGPT to infer, we assume it carries high information and can be viewed as a keyword.
• Extracts unpredictable words as keywords.
[Figure: DialoGPT_KE pipeline]
Original dialogue:
Tom: yoo guys. EOS1
John: hey wassup. EOS2
Tom: Remmber the meeting EOS3
John: I almost forget it. EOS4
Tom: fine EOS5
John: Where? EOS6
Tom: at Barbara's place. EOS7
The dialogue is converted into context-response pairs (e.g., Context: "hey wassup. EOS2" → Response: "Remmber the meeting EOS3"). For each golden response, DialoGPT yields a word-level loss per word (loss31, loss32, ...); high-loss words such as "Remmber" are extracted as keywords, and the word-level losses are averaged into an utterance-level loss (e.g., loss3).
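The selection step can be sketched in plain Python. This is an illustrative simplification: it assumes the per-word DialoGPT losses have already been computed, and the `ratio` cutoff is a hypothetical knob, not a value from the paper.

```python
def extract_keywords(tokens, word_losses, ratio=0.3):
    """Pick the hardest-to-predict words of a golden response as keywords.

    tokens:      words of the golden response
    word_losses: DialoGPT word-level losses (negative log-likelihoods),
                 assumed precomputed, one per token
    ratio:       fraction of words to keep (illustrative cutoff)
    """
    # Rank words by loss: the less predictable a word, the more
    # information it carries, so it is more likely a keyword.
    ranked = sorted(zip(tokens, word_losses), key=lambda p: p[1], reverse=True)
    k = max(1, int(len(tokens) * ratio))
    keywords = [tok for tok, _ in ranked[:k]]
    # The utterance-level loss is the average of the word-level losses.
    utterance_loss = sum(word_losses) / len(word_losses)
    return keywords, utterance_loss
```

The utterance-level loss returned here is the quantity reused by the topic-segmentation annotator on the next slide.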
Topic Segmentation: DialoGPT_TS
• Motivation: if the response is difficult for DialoGPT to predict given the context, we assume the response may belong to another topic, and that there is a topic segmentation point between the context and the response.
• Inserts a topic segmentation point before an utterance if it is unpredictable.
[Figure: DialoGPT_TS pipeline]
The same context-response pairs and word-level losses as above; for each utterance, the word-level losses are averaged into an utterance-level loss (loss2, ..., loss7), and a segmentation point is inserted before any utterance whose loss indicates it is unpredictable from its context.
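A minimal sketch of the segmentation decision, assuming the utterance-level losses are precomputed; the mean-loss threshold is an illustrative choice, not necessarily the paper's criterion.

```python
def segment_topics(utterance_losses, threshold=None):
    """Return indices of utterances that start a new topic segment.

    utterance_losses: DialoGPT utterance-level losses (average word-level
                      loss of each response), assumed precomputed
    threshold:        loss cutoff; defaults to the mean loss
                      (an illustrative choice)
    """
    if threshold is None:
        threshold = sum(utterance_losses) / len(utterance_losses)
    # An unpredictable (high-loss) response likely belongs to a new topic,
    # so a segmentation point is inserted before it.
    return [i for i, loss in enumerate(utterance_losses) if loss > threshold]
```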
Redundancy Detection: DialoGPT_RD
• Motivation: if an utterance brings little information and has little effect on predicting the response, it is a redundant utterance.
• Detects utterances that are useless for context representation as redundant.
[Figure: DialoGPT_RD pipeline]
The whole dialogue is fed into DialoGPT as a single sequence, and the hidden states at the EOS tokens (h_EOS1, ..., h_EOS7) serve as context representations. Utterances are removed one at a time; after each removal, the similarity between the resulting context representation and the original one is calculated (e.g., 0.998, 0.991, ...). Utterances whose removal leaves the representation nearly unchanged (in the example, utterances 5 and 4) are annotated as redundant.
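The similarity test can be sketched as follows. This is a simplified one-shot leave-one-out variant of the iterative removal procedure in the figure; the representations are assumed precomputed, and the 0.99 threshold is illustrative.

```python
def cosine(u, v):
    # Cosine similarity between two dense vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sum(a * a for a in u) ** 0.5
    norm_v = sum(b * b for b in v) ** 0.5
    return dot / (norm_u * norm_v)

def detect_redundant(full_repr, leave_one_out_reprs, threshold=0.99):
    """Mark utterances as redundant via representation similarity.

    full_repr:           context representation of the whole dialogue
                         (e.g., the final EOS hidden state), precomputed
    leave_one_out_reprs: element i is the context representation with
                         utterance i removed, precomputed
    threshold:           similarity cutoff (0.99 is an illustrative value)
    """
    # If removing an utterance barely changes the context representation,
    # the utterance contributed little information and is redundant.
    return [i for i, r in enumerate(leave_one_out_reprs)
            if cosine(r, full_repr) > threshold]
```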
Dataset and Metrics
• Datasets
  • SAMSum
  • AMI
• Evaluation Metrics
  • ROUGE
  • BERTScore
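ROUGE measures n-gram overlap between a candidate and a reference summary (ROUGE-1: unigrams, ROUGE-2: bigrams, ROUGE-L: longest common subsequence). A minimal ROUGE-1 F1 re-implementation for intuition; the reported numbers come from the standard ROUGE toolkit, not this sketch.

```python
from collections import Counter

def rouge1_f(candidate, reference):
    """Unigram-overlap ROUGE-1 F1 between two whitespace-tokenized strings."""
    c, r = Counter(candidate.split()), Counter(reference.split())
    overlap = sum((c & r).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(c.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)
```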
[Table: Statistics for SAMSum and AMI datasets]
Effect of DialoGPT_KE
• Entities play an important role in summary generation.
• Combined with DialoGPT embeddings, KeyBERT can get better results.

Intrinsic Evaluation for Keywords
• View reference summary words as golden keywords.
• Both TextRank and Entities perform poorly in recall.
• Our method can extract more diverse keywords.
Effect of DialoGPT_RD
• Rule-based method: annotates utterances without a noun, verb, or adjective as redundant.
• Our method shows more advantages for the long and verbose meeting transcripts in AMI.
Ablation Studies for Annotations
• For both datasets, summarizers trained on data with two of the three annotations surpass the corresponding summarizers trained with only one type of annotation.
• Summarizers trained on D_KE+TS still get improvements on both datasets.
Conclusion
• We investigate using DialoGPT as an unsupervised annotator for dialogue summarization, covering keywords extraction, redundancy detection, and topic segmentation.
• Experimental results show that our method consistently obtains improvements over a pre-trained summarizer (BART) and a non-pre-trained summarizer (PGN) on both datasets.
• Combining all three annotations, our summarizer achieves new state-of-the-art performance on the SAMSum dataset.