Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017

Transcript of Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017

Page 1: Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017

Natural Language Understanding @ Facebook Scale

ENGINEERING MANAGER, FB

Rushin Shah

Page 2: Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017

Our Goal: Understand textual content with near-human accuracy at Facebook scale

Page 3: Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017

Friends ML Conferences

Hockey

The Daily Puck

The Canadiens ran through a high-intensity practice on Wednesday ahead of Thursday's game.

1 hr •

Ron Timpany, Dan Fell and 80 others

Like Comment Share

Montreal Hockey Insider

Canadiens on verge of clinching playoff spot, Lindgren called up.

1 hr •

Bill Russell, Joe Tony and 116 others

Like Comment Share

Benoit Dumoulin was watching Vancouver Canucks vs. Montreal Canadiens.

Yes! In OT!

1 hr •

Bill Russell, Joe Tony and 234 others

Like Comment Share

VAN 3 FINAL OT 4 MTL

Page 4: Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017

NLU Tasks

• Text classification

• Word classification

• Content similarity

• Entity Resolution

Page 5: Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017

Text Classification

Entity: Delicious Food

Jole Simmons

1 hr •

Sarah Russell and 23 others 4 Comments

Like Comment Share

Topic: Cooking

I'm trying out this new recipe for a coconut curry tonight. It looks DELICIOUS!!!

Entity: Kaepernick

Jole Simmons

1 hr •

Sarah Russell and 23 others 4 Comments

Like Comment Share

Topic: Sports or Cooking?

Last night's game was absolutely incredible. Once Curry gets cooking, there's no stopping that guy!

Page 6: Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017

Word Classification

Page 7: Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017

Content similarity

SIMILARITY 0.75

Page 8: Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017

Parsing out Entities

"DID YOU KNOW THAT MUCK BUCKET SUNSHINE IS PERFORMING LIVE AT THE BOOM BOOM ROOM?"

MUCKBUCKET SUNSHINE: MUSIC BAND

TOPIC: LIVE EVENT

The Boom Boom Room

1 hr •

Tonight before you Dance Dance Dance, join us for a special evening. Boom Boom Room Presents: Muckbucket Sunshine! Doors 8pm.

Sarah Russell and 23 others 4 Comments

Like Comment Share

Page 9: Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017

• Deep Learning For NLU

• Continuous Representation

• Can Solve Hard NLP Problems

Natural Language Processing (Almost) From Scratch

Page 10: Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017

DeepText

Page 11: Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017

DeepText Features

Multiple Tasks, Multiple Languages, Multiple Architectures

Page 12: Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017

DeepText Platform architecture

Feat Prep, Model Training, Deployment

Data Loader, Tokenizer, Feature Extraction

Model Structure, Learning Algorithm

Page 13: Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017

DeepText Tasks

Sequence Labeling, Classification

Page 14: Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017

DeepText Document classification

Page 15: Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017

DeepText Document classification - LSTM

Diagram (input to output): words "Messi", "today", "scored"; one Embedding per word; one Hidden state per word; MLP.

Page 16: Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017

DeepText Document classification - CNN

Diagram (input to output): words "Messi", "today", "scored"; one Embedding per word; CNN; MLP.
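The Embedding and CNN stages labelled on this slide can be sketched in plain Python so the arithmetic is visible. This is only an illustration: the embedding values, the two filters, the window width of two words, the ReLU, and the max-over-time pooling are all assumptions, since the slide does not give DeepText's actual layers or dimensions. The resulting feature vector is what an MLP would consume.

```python
# Toy 1-D convolution over word embeddings, followed by max pooling.
# Every number below is made up for illustration.

EMBEDDINGS = {
    "messi": [1.0, 0.0],
    "today": [0.0, 1.0],
    "scored": [1.0, 1.0],
}

# Each filter spans two consecutive words (one weight row per word).
FILTERS = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[-1.0, 0.0], [1.0, 1.0]],
]


def embed(tokens):
    """Look up one embedding vector per word."""
    return [EMBEDDINGS[t.lower()] for t in tokens]


def conv1d(vectors, filt):
    """Slide the filter over the word sequence and apply ReLU."""
    width = len(filt)
    outputs = []
    for i in range(len(vectors) - width + 1):
        window = vectors[i:i + width]
        total = sum(
            w * x
            for filt_row, vec_row in zip(filt, window)
            for w, x in zip(filt_row, vec_row)
        )
        outputs.append(max(0.0, total))
    return outputs


def doc_features(tokens, filters):
    """One max-pooled value per filter: a fixed-size document vector."""
    vectors = embed(tokens)
    return [max(conv1d(vectors, f)) for f in filters]


features = doc_features(["Messi", "today", "scored"], FILTERS)
```

Max pooling is what makes the output length independent of the number of words, which is why the same MLP can follow documents of any length.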

Page 17: Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017

DeepText Classification: FastText

Diagram (input to output): words "Messi", "today", "scored"; features Unigram, Bigram, Unigram, Bigram, Unigram; Hidden; Output.
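A minimal sketch of the idea behind this slide: unigram and bigram features are looked up, averaged into a single hidden vector, and passed to a linear output layer with a softmax. The feature vectors, labels, and weights are invented for illustration, and the real fastText library additionally hashes n-grams into buckets, which is skipped here.

```python
import math


def ngram_features(tokens):
    """Interleave unigrams and bigrams: w1, "w1 w2", w2, "w2 w3", w3."""
    feats = []
    for i, tok in enumerate(tokens):
        feats.append(tok)
        if i + 1 < len(tokens):
            feats.append(tok + " " + tokens[i + 1])
    return feats


def average(vectors):
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(tokens, feature_vectors, output_weights):
    """Average the known n-gram vectors, then apply a linear layer."""
    feats = [f for f in ngram_features(tokens) if f in feature_vectors]
    hidden = average([feature_vectors[f] for f in feats])
    scores = [sum(w * h for w, h in zip(row, hidden)) for row in output_weights]
    return softmax(scores)


# Hypothetical parameters; a trained model would learn these.
LABELS = ["sports", "cooking"]
FEATURE_VECTORS = {
    "Messi": [2.0, 0.0],
    "Messi today": [1.0, 0.0],
    "today": [0.0, 0.0],
    "today scored": [1.0, 0.0],
    "scored": [1.0, 0.0],
}
OUTPUT_WEIGHTS = [[1.0, 0.0], [0.0, 1.0]]

tokens = ["Messi", "today", "scored"]
probs = predict(tokens, FEATURE_VECTORS, OUTPUT_WEIGHTS)
best_label = LABELS[probs.index(max(probs))]
```

For three words the feature list has three unigrams and two bigrams, matching the five feature boxes in the diagram.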

Page 18: Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017

DeepText Word classification

Page 19: Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017

DeepText Classification: Word classification

Diagram (input to output): sentence "Messi today scored a hat-trick"; one Embedding per word (shown for "Messi" and "today"); CNN/LSTM; one MLP per word; outputs Classification(Messi) and Classification(today).

Page 20: Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017

DeepText Content Similarity

Diagram (input to output): two text inputs (each shown as "Messi", "today", "scored"); one Embedding per word; a CNN / LSTM encoder per input; MLP; Ranking loss.
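The slide names only a "Ranking loss" on top of two encoded texts. One common choice, assumed here purely for illustration, is a margin (hinge) loss over cosine similarities: a related pair should score at least `margin` higher than an unrelated pair. The vectors stand in for encoder outputs and are made up.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def margin_ranking_loss(anchor, positive, negative, margin=0.5):
    """Zero once the positive beats the negative by at least `margin`."""
    return max(0.0, margin - cosine(anchor, positive) + cosine(anchor, negative))


# Stand-ins for the encoded representations of three texts.
anchor = [1.0, 0.0]
related = [1.0, 0.0]
unrelated = [0.0, 1.0]

loss_when_ranked_correctly = margin_ranking_loss(anchor, related, unrelated)
loss_when_ranked_wrongly = margin_ranking_loss(anchor, unrelated, related)
```

Training on such a loss pushes related texts toward high similarity scores, such as the 0.75 shown on the earlier content similarity slide.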

Page 21: Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017

DeepText Entity recognition and linking

FC Barcelona

Real Madrid C.F.

Page 22: Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017

Entity recognition and linking Architecture

Document, Mention Detection, Candidate Selection, Disambiguation, Entity Annotations
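The stages named on this slide can be sketched as three small functions chained together. Everything concrete below is hypothetical: the alias table, the context-word sets, and the overlap-count scoring are toy stand-ins chosen to keep the example short, not a description of how DeepText implements any stage.

```python
# Toy knowledge base: surface form -> candidate entities (hypothetical).
ALIASES = {
    "barcelona": ["FC Barcelona", "Barcelona (city)"],
    "madrid": ["Real Madrid C.F.", "Madrid (city)"],
}

# Words that tend to appear near each entity (hypothetical).
CONTEXT_WORDS = {
    "FC Barcelona": {"match", "goal", "scored", "league"},
    "Barcelona (city)": {"visit", "beach", "travel"},
    "Real Madrid C.F.": {"match", "goal", "scored", "league"},
    "Madrid (city)": {"visit", "museum", "travel"},
}


def tokenize(text):
    return [w.strip(".,!?").lower() for w in text.split()]


def detect_mentions(tokens):
    """Mention detection: positions of tokens that match a known alias."""
    return [(i, tok) for i, tok in enumerate(tokens) if tok in ALIASES]


def select_candidates(mention):
    """Candidate selection: all entities the alias could refer to."""
    return ALIASES[mention]


def disambiguate(candidates, context):
    """Disambiguation: pick the candidate sharing most words with the text."""
    return max(candidates, key=lambda c: len(CONTEXT_WORDS[c] & context))


def link_entities(text):
    """Document in, entity annotations out."""
    tokens = tokenize(text)
    context = set(tokens)
    annotations = []
    for index, mention in detect_mentions(tokens):
        entity = disambiguate(select_candidates(mention), context)
        annotations.append(
            {"mention": mention, "token_index": index, "entity": entity}
        )
    return annotations


sports_post = link_entities("Messi scored as Barcelona beat Madrid in the league match.")
travel_post = link_entities("We plan to visit Barcelona and travel on.")
```

The same mention, "Barcelona", resolves to the football club in the first post and to the city in the second, which is the ambiguity the disambiguation stage exists to handle.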

Page 23: Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017

Exploring Use Cases In Facebook

Page 24: Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017

For Sale

Are You Selling Something?

Create a Sales post to sell your items faster. Only post as a discussion if you’re not selling something.

Post for Sale

Not Selling

Page 25: Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017

Social Recs

Page 26: Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017

Scale ML Experts

Reuse Models

Optimize Labels

CLUE

Page 27: Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017

CLUE

Label Efficiency

Active Learning

Democratize

Single entry point for NLU

Flexibility and Scale

DeepText

Scale ML Experts

Reuse Models

Optimize Labels

Page 28: Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017

Self Service With CLUE

Collect Data (Search)

Label Data

Train Classifier

Active Learning

Review

Threshold

Precision
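Two of the steps labelled here lend themselves to a short sketch: choosing which examples to label next (Active Learning) and choosing a score cut-off that reaches a target precision (Threshold, Precision). The slide does not say which strategy CLUE uses, so uncertainty sampling and a simple threshold scan are assumptions, and the example posts and scores are invented.

```python
def most_uncertain(scored_examples, k):
    """Uncertainty sampling: the k examples scored closest to 0.5."""
    return sorted(scored_examples, key=lambda ex: abs(ex[1] - 0.5))[:k]


def precision_at(threshold, scored_labels):
    """Precision of predicting positive when score >= threshold."""
    predicted_positive = [label for score, label in scored_labels if score >= threshold]
    if not predicted_positive:
        return None
    return sum(predicted_positive) / len(predicted_positive)


def lowest_threshold_for(target_precision, scored_labels):
    """Smallest observed score that meets the target precision, if any."""
    for threshold in sorted({score for score, _ in scored_labels}):
        precision = precision_at(threshold, scored_labels)
        if precision is not None and precision >= target_precision:
            return threshold
    return None


# Unlabeled posts with classifier scores for "is a for-sale post".
unlabeled = [
    ("selling my bike", 0.95),
    ("anyone free tonight?", 0.10),
    ("bike for a good home", 0.55),
    ("looking for a couch", 0.40),
]
to_label_next = [text for text, _ in most_uncertain(unlabeled, 2)]

# Reviewed examples: (classifier score, true label).
reviewed = [(0.9, 1), (0.8, 1), (0.6, 0), (0.4, 1), (0.2, 0)]
threshold = lowest_threshold_for(0.9, reviewed)
```

Every observed score is scanned rather than stopping at the first drop, because precision is not guaranteed to rise monotonically with the threshold.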

Diagram (input to output): words W1 W2 W3 W4 ... WN; CONVOLUTION; POOLING.

I want some subway or burger king

Slots

[
  {"name": "cu:restaurant", "value": "subway", "start": 12, "end": 19, "contextIdx": 0},
  {"name": "cu:restaurant", "value": "burger king", "start": 22, "end": 32, "contextIdx": 0}
]

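To connect the slot output on this slide to word classification, here is a minimal sketch that turns per-word tags into slot records with character offsets. The BIO tagging scheme and the hand-written tags are assumptions standing in for a model's predictions, and the offsets use Python's half-open convention, so the `end` values need not match the numbers printed on the slide.

```python
def tags_to_slots(utterance, tags):
    """Merge B-/I- tagged words into slots with character offsets."""
    slots = []
    current = None
    position = 0
    for token, tag in zip(utterance.split(), tags):
        start = utterance.index(token, position)
        end = start + len(token)
        position = end
        if tag.startswith("B-"):
            # A B- tag opens a new slot.
            current = {"name": tag[2:], "start": start, "end": end}
            slots.append(current)
        elif tag.startswith("I-") and current is not None and current["name"] == tag[2:]:
            # An I- tag extends the slot that is currently open.
            current["end"] = end
        else:
            current = None
    for slot in slots:
        slot["value"] = utterance[slot["start"]:slot["end"]]
    return slots


utterance = "I want some subway or burger king"
# Hypothetical per-word predictions, one tag per whitespace-separated word.
tags = ["O", "O", "O", "B-cu:restaurant", "O", "B-cu:restaurant", "I-cu:restaurant"]
slots = tags_to_slots(utterance, tags)
```

Here `slots` holds two `cu:restaurant` records, one for "subway" and one for "burger king", the same two values the slide lists.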
Page 29: Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017

200+ MODELS