Daniel Shank, Data Scientist, Talla at MLconf SF 2017

Getting Value Out of Chat Data

WHAT TO DO WHEN YOUR DATA IS NOISY, SPARSE, AND SHORT

Introduction

Contact: daniel@talla.com

NLP for internal business use cases

Smart knowledge management

Hiring!

What is “chat data”?

USER2: USER3 do you have new new cal on your Talla account already? Looks like it’s not available for me yet. Would be nice if we could also get inbox support enabled since it’s so much better than gmail. cc USER1
USER3: USER2 I realized that after I typed this that I was using my personal gmail when I updated to the new changes. I looked on Talla and I didn’t see the same option to update to new calendar yet.
USER4: USER2 I just enabled Inbox for our domain
USER4: new calendar is set to letting google decide when to roll it out, but it looks like we can also enable it as an option now
USER4: I've now set that to be available as well. These may take some time to show up
USER1: USER2 its been enabled for awhile.
USER1: (inbox)
USER1: and the new calendar is enabled, soon as google decides you are allowed to have it.
USER2: Thanks USER1 USER4

Things similar to chat data

Sequential interactions

Forum posts

Some email

IT ticketing system interactions

Short text

Associated with a user

Possibly directed at another user

Highly context dependent

Problems with chat

Increasing number of data sources

In theory contains lots of valuable information

In practice data is unlabeled

“Water, water, everywhere, but not a drop to drink.”

Goal: Issue detection and matching

People get help through chat platforms

Extract that data and automate the process

USER1’s interaction should help USER3!

USER1: Hi, does anyone know if we have patriot’s day off?
USER2: Yeah USER1, we do.
USER1: Thanks!
…
USER3: Hey, do we get patriot’s day off?

Automating knowledge delivery

Find issues or questions that people have

Match new issues to pre-existing ones

Serve the appropriate response or answer

Extracting answers is very hard

Focus on matching and search
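A minimal sketch of the matching step, assuming plain token overlap (Jaccard similarity) as the similarity measure. The talk does not say which measure is used; the function names and the 0.25 threshold are illustrative assumptions.

```python
import re

def tokenize(text):
    """Lowercase and split into a set of word tokens (apostrophes kept inside words)."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def jaccard(a, b):
    """Token-set overlap between two messages, in [0, 1]."""
    ta, tb = tokenize(a), tokenize(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

def best_match(new_question, past_questions, threshold=0.25):
    """Return the most similar past question, or None if nothing clears the threshold."""
    scored = [(jaccard(new_question, q), q) for q in past_questions]
    score, match = max(scored, default=(0.0, None))
    return match if score >= threshold else None

# Hypothetical store of previously seen questions.
past = [
    "Hi, does anyone know if we have patriot's day off?",
    "how do i record a vidyo meeting?",
]
print(best_match("Hey, do we get patriot's day off?", past))
```

Token overlap is only a baseline: it misses paraphrases that share no words, which is one reason to look at better representations.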

Overview

Jumpstart ML: Active Learning

Topic modeling

Dimensionality Reduction and Representations

Find questions and analyze

Use patterns to find questions

Has ‘?’ token

Has a question word

Not too hard

Good start for finding past issues
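The two patterns above can be sketched in a few lines. The word list and the choice to check only the first token for a question word are assumptions for illustration, not the speaker's actual rules.

```python
import re

# Assumed list of question-opening words; extend for your own data.
QUESTION_WORDS = {
    "who", "what", "when", "where", "why", "how", "which",
    "can", "could", "do", "does", "is", "are", "will", "would",
}

def looks_like_question(message):
    """True if the message has a '?' or opens with a question word."""
    if "?" in message:
        return True
    tokens = re.findall(r"[a-z']+", message.lower())
    return bool(tokens) and tokens[0] in QUESTION_WORDS

messages = [
    "Hey, do we get patriot's day off?",
    "how do i record a vidyo meeting",
    "Thanks USER1 USER4",
]
questions = [m for m in messages if looks_like_question(m)]
```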

Problems with extracted questions

Most questions need context to understand. e.g.:

“What is it?”

“Can I use her personal email?”

Intent varies:

Want information

Do this thing for me

Only some questions make sense out of context

“Who is she?” “What is that?” “Will that fix my computer?”

Anaphora—it, that

Pronouns—He, she, etc

“What day is it?”, “Where am I?”

Answer depends on time, person asking

Requires more involved data model
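A rough sketch of flagging questions that probably need context, assuming a purely lexical check for anaphora and third-person pronouns. The word lists and the function name are hypothetical.

```python
import re

# Assumed word lists, for illustration only.
ANAPHORA = {"it", "that", "this", "these", "those"}
PRONOUNS = {"he", "she", "him", "her", "they", "them"}

def context_flags(question):
    """Return the reasons a question likely cannot stand alone (empty list if none found)."""
    tokens = set(re.findall(r"[a-z']+", question.lower()))
    flags = []
    if tokens & ANAPHORA:
        flags.append("anaphora")
    if tokens & PRONOUNS:
        flags.append("pronoun")
    return flags

print(context_flags("Who is she?"))
print(context_flags("Will that fix my computer?"))
print(context_flags("how do i record a vidyo meeting?"))
```

A lexical check like this does not catch “Where am I?”, whose answer depends on who is asking; that case needs the more involved data model.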

Questions have different intents

“Performative” – Please help me? ex:

hi can you please help me reset my 2 factor authentication on salesforce?

“Informational” – What is it?

what's the pl code?

“Navigational” – How do I do this?

how do i record a vidyo meeting?

Can we write special case rules?

Borderline cases

is there a way to find out the size of an hbase table? – User asks “Is there (a way…)” to get directions

can anyone tell me where i find the out of stock request report? – User asks someone to give them information

Many variants

Alternative is to label data and use supervised learning
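To see why special-case rules run out quickly, here is a small rule set for the three intents. The patterns are assumptions made up for this sketch; they handle the three clear examples but leave both borderline cases unclassified.

```python
import re

# Hypothetical hand-written rules, checked in order.
RULES = [
    ("performative", re.compile(r"\b(can|could) you\b|\bhelp me\b")),
    ("navigational", re.compile(r"^how (do|can) i\b")),
    ("informational", re.compile(r"^(what|who|when|which)\b")),
]

def intent(question):
    """Return the first matching intent label, or 'unknown' if no rule fires."""
    text = question.lower().strip()
    for label, pattern in RULES:
        if pattern.search(text):
            return label
    return "unknown"

examples = [
    "hi can you please help me reset my 2 factor authentication on salesforce?",
    "what's the pl code?",
    "how do i record a vidyo meeting?",
    "is there a way to find out the size of an hbase table?",
    "can anyone tell me where i find the out of stock request report?",
]
for q in examples:
    print(intent(q), "|", q)
```

Each new variant needs another pattern, and patterns start to conflict with each other, which is the motivation for labeling data instead.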

We want to label data, but…

Managing crowdworkers:

Expensive

Time consuming

Can’t be used unless data is safely anonymous

Will the model work afterwards?

Active Learning makes labeling more efficient

More value for your time

Can use with crowd workers or without

Good for chat:

Models train fast

Quick to annotate

Supervised learning with little labeled data

Cycle: Annotate → Train/Predict → Get data → Annotate

How it works (roughly)

Annotate an initial subset 𝐷0 ⊂ 𝐷

Train your model on 𝐷0

Predict labels on remaining data (𝐷 − 𝐷0)

Choose more data to annotate, 𝐷1 ⊂ 𝐷 − 𝐷0

Choice of 𝐷1 is based on label predictions

Repeat

Profit!
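The loop above can be sketched end to end. Two things here are assumptions rather than the speaker's method: the model is a tiny hand-rolled Naive Bayes classifier, and the next batch is chosen by uncertainty sampling (the items whose predicted probability is closest to 0.5). The `annotate` callback stands in for a human annotator.

```python
import math
import re
from collections import Counter

def tokens(text):
    return re.findall(r"[a-z']+", text.lower())

def train(labeled):
    """Fit a tiny multinomial Naive Bayes model on (text, label) pairs, labels 0/1."""
    counts = {0: Counter(), 1: Counter()}
    docs = Counter()
    for text, label in labeled:
        counts[label].update(tokens(text))
        docs[label] += 1
    vocab = set(counts[0]) | set(counts[1])
    return counts, docs, vocab

def prob_positive(model, text):
    """Estimated P(label = 1 | text), using add-one smoothing."""
    counts, docs, vocab = model
    total1 = sum(counts[1].values()) + len(vocab) + 1
    total0 = sum(counts[0].values()) + len(vocab) + 1
    log_odds = math.log((docs[1] + 1) / (docs[0] + 1))
    for tok in tokens(text):
        log_odds += math.log(((counts[1][tok] + 1) / total1) /
                             ((counts[0][tok] + 1) / total0))
    return 1 / (1 + math.exp(-log_odds))

def active_learning(pool, annotate, seed_size=2, batch_size=1, rounds=3):
    labeled = [(t, annotate(t)) for t in pool[:seed_size]]  # annotate D0
    unlabeled = list(pool[seed_size:])                      # D - D0
    for _ in range(rounds):
        if not unlabeled:
            break
        model = train(labeled)                              # train on labels so far
        # Predict on the rest and put the least certain items first.
        unlabeled.sort(key=lambda t: abs(prob_positive(model, t) - 0.5))
        chosen, unlabeled = unlabeled[:batch_size], unlabeled[batch_size:]
        labeled += [(t, annotate(t)) for t in chosen]       # annotate D1, repeat
    return train(labeled), labeled

# Toy pool; the lambda simulates an annotator marking questions as 1.
pool = [
    "how do i record a vidyo meeting?",
    "Thanks USER1 USER4",
    "what's the pl code?",
    "I just enabled Inbox for our domain",
    "do we get patriot's day off?",
    "These may take some time to show up",
]
model, labeled = active_learning(pool, annotate=lambda t: int("?" in t))
```

Because the model retrains in well under a second on short messages, the annotate/train/predict cycle stays interactive, which matches the point that chat data suits active learning.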

Where we are

Topic modeling
