Sequence to Sequence Learning · joint work with many Google Brain collaborators Sequence to...

Post on 05-Jul-2020

8 views 0 download

Transcript of Sequence to Sequence Learning · joint work with many Google Brain collaborators Sequence to...

Quoc V. Le

joint work with many Google Brain collaborators

Sequence to Sequence Learning

brainvision summit

0

300

600

900

1200

Q1-2012 Q3-2012 Q1-2013 Q3-2013 Q1-2014 Q3-2014 Q1-2015 Q3-2015

Projects use GoogleBrain

Time

0

300

600

900

1200

Q1-2012 Q3-2012 Q1-2013 Q3-2013 Q1-2014 Q3-2014 Q1-2015 Q3-2015

Significant speech & vision improvements

Sequence to sequence models

TensorFlow launched

SmartReply

RankBrain

Directories use

GoogleBrain

Time

Discovery of Cat Detector on YouTube

Google Photos

Word2Vec

0

300

600

900

1200

Q1-2012 Q3-2012 Q1-2013 Q3-2013 Q1-2014 Q3-2014 Q1-2015 Q3-2015

Significant speech & vision improvements

TensorFlow launched

SmartReply

RankBrain

Directories use

GoogleBrain

Time

Discovery of Cat Detector on YouTube

Google Photos

Word2Vec

Sequence to sequence models

Machines that understand natural language

Step 1: Understand words

Machines that understand natural language

Step 1: Understand words

Step 2: Understand strings of words

Machines that understand natural language

Step 1: Understand words

Step 2: Understand strings of words

a aa dogcat

. . .

zzz

100 dimensions . . .

the

. . .

sat

. . .. . .

aardvark

the cat sat on the mat

a aa dogcat

. . .

zzz

100 dimensions . . .

the

. . .

sat

. . .. . .

lovelikeadoreking

queenprincess

prince

saturdaysundaymonday

manwoman

googlefacebook microsoft

ibm

hellohi

goodbye

englandfrance

russiachina

smallsmaller

bigbigger

first principal component

lovelikeadoreking

queenprincess

prince

saturdaysundaymonday

manwoman

googlefacebook microsoft

ibm

hellohi

goodbye

englandfrance

russiachina

smallsmaller

bigbigger

first principal component

lovelikeadoreking

queenprincess

prince

saturdaysundaymonday

manwoman

googlefacebook microsoft

ibm

hellohi

goodbye

englandfrance

russiachina

smallsmaller

bigbigger

first principal component

four

five

three

two

one

English word vectors for numbers

four

five

three

two

one

cuatro

cinco

tres

dos

uno

Spanish word vectors for numbers

English word vectors for numbers

horsecow dog

cat

pig

caballovaca

cerdo

gato

perro

Spanish word vectors for animals

English word vectors for animals

horsecow dog

cat

pig

caballovaca

cerdo

gato

perro

Spanish word vectors for animals

English word vectors for animals

Mikolov, Le, Sutskever Exploiting Similarities among Languages for

Machine Translation. arXiv, 2013

Machines that understand natural language

Step 1: Understand words

Step 2: Understand strings of words

i love music

i like mathematics

i enjoy computer science

me encanta la musica

Spanish “sentence” vectorsEnglish “sentence” vectors

me gustan las matemáticas

me gusta disfrutar de la informática

i

me encanta la musica

<start> me encanta lalove music musica

<stop>

i

me encanta la musica

<start> me encanta lalove music musica

<stop>

“end-to-end training”Sutskever, Vinyals, Le. Sequence to Sequence Learning

with Neural Networks. NIPS, 2014. Related Approach: Cho & Bengio. Encoder-Decoder RNNs.

i

brainvision summit

lovemusic

<start>

meencantala

musica<stop>

i

me encanta la musica

<start> me encanta lalove music musica

<stop>

Mary admires John

Mary is in love with John

Mary respects John

Mary admires John

Mary is in love with John

Mary respects John

John admires Mary

John is in love with Mary

John respects Mary

Quoc: Hello, I have a problem with my machine Machine: Hi, this is Shiva. How may I assist you today? Quoc: I want to access using VPN Machine: Currently, the machine is connected to corporate network? Quoc: No Machine: Check out this solution <URL> Quoc: Thanks , bye! Machine: Thank you.

Quoc: What is the purpose of life? Machine: To serve the greater good.

Quoc: What is the purpose of living? Machine: To live forever.

Quoc: What is the purpose of dying? Machine: To have a life.

Vinyals, Le. A Neural Conversational Model. ICML Deep Learning Workshop, 2015.

SmartReply

Machines that understand natural language

Step 1: Understand words

Step 2: Understand strings of words

Step 3: Have memory, logical reasoning

Advances in Memory and Logical Reasoning

“Soft” Attention Mechanism (Bahdanau, Cho, Bengio, 2014)

Augmented Memory (Graves et al, 2014; Weston et al, 2014)

Augmented Logic and Arithmetic Components (Neelakatan, Le, Sutskever, 2016)

i

me encanta la musica

<start> me encanta lalove music musica

<end>

i

me

<start>love music

i

me

<start>love music

i

me

<start>love music

i

me

<start>love music

Bahdanau, Cho, Bengio. Neural Machine Translation by Jointly Learning to Align and Translate. ICLR, 2015.

i

me

<start>love musicExternal Memory

Graves, Wayne, Danihelka. Neural Turing Machines. ArXiv, 2014 Sukhabaatar, Szlam, Weston, Fergus.

End-to-end memory networks. NIPS, 2015.

i

me

<start>love musicExternal Memory

Arithmetic& Logic

Functions

Neelakantan, Le, Sutskever. Neural Programmer: Inducing Latent Programs with Gradient Descent. ICLR, 2016.

add(a,b)subtract(a,b)multiply(a,b)

i

me

<start>love musicExternal Memory

Arithmetic& Logic

Functions

Neelakantan, Le, Sutskever. Neural Programmer: Inducing Latent Programs with Gradient Descent. ICLR, 2016.

add(a,b)subtract(a,b)multiply(a,b)

Quoc: I have 2 apples and Tom gives me 3 apples. How many apples do I have?

Machine: 5 apples.

Quoc: I have 2 apples and Tom gives me 3 apples. How many apples do I have?

brainvision summit

External Memory

Arithmetic & LogicFunctions

add(a,b)subtract(a,b)multiply(a,b)

Quoc: I have 2 apples and Tom gives me 3 apples. How many apples do I have?

brainvision summit

External Memory

Arithmetic & LogicFunctions

add(a,b)subtract(a,b)multiply(a,b)

Quoc: I have 2 apples and Tom gives me 3 apples. How many apples do I have?

brainvision summit

External Memory

Arithmetic & LogicFunctions

add(a,b)subtract(a,b)multiply(a,b)

Quoc: I have 2 apples and Tom gives me 3 apples. How many apples do I have?

brainvision summit

External Memory

Arithmetic & LogicFunctions

add(a,b)subtract(a,b)multiply(a,b)

Quoc: I have 2 apples and Tom gives me 3 apples. How many apples do I have?

brainvision summit

External Memory

Arithmetic & LogicFunctions

add(a,b)subtract(a,b)multiply(a,b)

2

Quoc: I have 2 apples and Tom gives me 3 apples. How many apples do I have?

brainvision summit

External Memory

Arithmetic & LogicFunctions

add(a,b)subtract(a,b)multiply(a,b)

2,3

Quoc: I have 2 apples and Tom gives me 3 apples. How many apples do I have?

brainvision summit

External Memory

Arithmetic & LogicFunctions

add(a,b)subtract(a,b)multiply(a,b)

2,3

Quoc: I have 2 apples and Tom gives me 3 apples. How many apples do I have?

brainvision summit

External Memory

Arithmetic & LogicFunctions

add(a,b) = 5subtract(a,b) = -1multiply(a,b) = 6

2,3

Quoc: I have 2 apples and Tom gives me 3 apples. How many apples do I have?

brainvision summit

External Memory

Arithmetic & LogicFunctions

add(a,b) = 5 * w1subtract(a,b) = -1 * w2multiply(a,b) = 6 * w3

2,3

Quoc: I have 2 apples and Tom gives me 3 apples. How many apples do I have?

brainvision summit

External Memory

Arithmetic & LogicFunctions

add(a,b) = 5 * w1subtract(a,b) = -1 * w2multiply(a,b) = 6 * w3

2,3

brainvision summit

External Memory

Arithmetic & LogicFunctions

add(a,b) = 5 * w1subtract(a,b) = -1 * w2multiply(a,b) = 6 * w3

2,3

Quoc: I have 2 apples and Tom gives me 3 apples. How many apples do I have?

Machine: 5 apples.

Machines that understand natural language

Step 1: Understand words

Step 2: Understand strings of words

Step 3: Have memory, logical reasoning

Step 4: Learn from many tasks

English (unsupervised)

German (translation)

Tags (parsing)English

English (unsupervised)

German (translation)

Tags (parsing)English

One-to-many

English (unsupervised)

German (translation)

Tags (parsing)English

One-to-many

English (unsupervised)

Image (captioning) English

German (translation)

Many-to-one

German (translation)

English (unsupervised) German (unsupervised)

English

Many-to-many

Able to improve accuracies of all related tasks

English (unsupervised)

German (translation)

Tags (parsing)English

One-to-many

English (unsupervised)

Image (captioning) English

German (translation)

Many-to-one

German (translation)

English (unsupervised) German (unsupervised)

English

Many-to-many

Dai, Le. Semi-supervised Sequence Learning. NIPS, 2015. Luong et al. Multitask Sequence Learning. ICLR, 2016

Able to improve accuracies of all related tasks

Machines that understand natural language

Step 1: Understand words

Step 2: Understand strings of words

Step 3: Have memory, logical reasoning

Step 4: Learn from many tasks