PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the...

36
SURF 2019 Jaehoon Cha Electrical and Electronics Engineering

Transcript of PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the...

Page 1: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

SURF 2019

Jaehoon ChaElectrical and Electronics Engineering

Page 2: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

Content

• What you can do

• Recurrent Neural Networks

• Hierarchical Neural Network Architecture

Page 3: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

What you can do

Page 4: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

21 3

1

2

3

4 𝑥

𝑦

What machines do

Page 5: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

What machines do

21 3

1

2

3

4 𝑥

𝑦

4

Page 6: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

What machines do

21 3

1

2

3

4 𝑥

𝑦

4

21 3

1

2

3

4 𝑥

𝑦

?

Page 7: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

What machines do

What is a f here?

e.g., • temperature, humidity, the amount of cloud

-> precipitation• population, the number of shops, the number of new residents

-> housing price

𝑡𝑎𝑟𝑔𝑒𝑡 = 𝑓(𝑓𝑒𝑎𝑡𝑢𝑟𝑒1, 𝑓𝑒𝑎𝑡𝑢𝑟𝑒2, 𝑓𝑒𝑎𝑡𝑢𝑟𝑒3)

Page 8: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

What machines do

𝑡𝑎𝑟𝑔𝑒𝑡 = 𝑓(𝑓𝑒𝑎𝑡𝑢𝑟𝑒1, 𝑓𝑒𝑎𝑡𝑢𝑟𝑒2, 𝑓𝑒𝑎𝑡𝑢𝑟𝑒3)

It is more difficult to find a function in real life

?

e.g., • temperature, humidity, the amount of cloud

-> precipitation• population, the number of shops, the number of new residents

-> housing price

Page 9: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

Machines can find the function

Page 10: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

Machines can find the function

Page 11: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

Machines can find the function

Page 12: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

Machines can find the function

Page 13: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

Machine learning can be used to analyze data

Page 14: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

Recurrent Neural Network

Page 15: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

Recurrent Neural Networks (RNNs)

• Recurrent Neural Networks (RNNs) is designed for time series data.

• The networks have loops so that they consider time dependencies between elements in the time series data.

Page 16: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

Recurrent Neural Networks (RNNs)

𝑓hell o

𝑓

h

𝑓

e

𝑓

l

𝑓

l

𝑠1 𝑠2 𝑠3o

Conventional Neural Networks:

Recurrent Neural Networks:

Page 17: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

Recurrent Neural Networks (RNNs)

Vanilla RNN Cell

𝑠𝑡−1 𝑠𝑡

𝑥𝑡

𝑠𝑡 = 𝑓(𝑊𝑥𝑡 + 𝑈𝑠𝑡−1 + 𝑏)

Page 18: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

Recurrent Neural Networks (RNNs)What if words are so long?e.g., Pneumonoultramicroscopicsilicovolcanoconiosis: lung disease caused by micro dust

𝑓

P

𝑓

n

𝑓

e

⋯𝑠1 𝑠2 𝑠3

𝑓

i

𝑠43s

Page 19: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

Long Short Term Memory (LSTM)

𝑓𝑡 = 𝜎(𝑊𝑓𝑥𝑡 + 𝑈𝑓ℎ𝑡−1 + 𝑏𝑓)

𝑖𝑡 = 𝜎(𝑊𝑖𝑥𝑡 + 𝑈𝑖ℎ𝑡−1 + 𝑏𝑖)

𝑜𝑡 = 𝜎(𝑊𝑜𝑥𝑡 + 𝑈𝑜ℎ𝑡−1 + 𝑏𝑜)

𝑎𝑡 = 𝑡𝑎𝑛ℎ(𝑊𝑎𝑥𝑡 + 𝑈𝑎ℎ𝑡−1 + 𝑏𝑎)

𝑥𝑡

𝐶𝑡−1

𝜎 𝑡𝑎𝑛ℎ

×

𝜎

× +

𝜎

×

𝑡𝑎𝑛ℎ

ℎ𝑡−1

𝐶𝑡

ℎ𝑡

𝑓𝑡 𝑖𝑡 𝑎𝑡𝑜𝑡

Page 20: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

Long Short Term Memory (LSTM)

𝑓𝑡 = 𝜎(𝑊𝑓𝑥𝑡 + 𝑈𝑓ℎ𝑡−1 + 𝑏𝑓)

𝑖𝑡 = 𝜎(𝑊𝑖𝑥𝑡 + 𝑈𝑖ℎ𝑡−1 + 𝑏𝑖)

𝑜𝑡 = 𝜎(𝑊𝑜𝑥𝑡 + 𝑈𝑜ℎ𝑡−1 + 𝑏𝑜)

𝑎𝑡 = 𝑡𝑎𝑛ℎ(𝑊𝑎𝑥𝑡 + 𝑈𝑎ℎ𝑡−1 + 𝑏𝑎)

𝐶𝑡 = 𝑎𝑡⨀𝑖𝑡 + 𝑓𝑡⨀𝐶𝑡−1ℎ𝑡 = tanh(𝐶𝑡)⨀𝑜𝑡

𝑥𝑡

𝐶𝑡−1

𝜎 𝑡𝑎𝑛ℎ

×

𝜎

× +

𝜎

×

𝑡𝑎𝑛ℎ

ℎ𝑡−1

𝐶𝑡

ℎ𝑡

𝑓𝑡 𝑖𝑡 𝑎𝑡𝑜𝑡

Page 21: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

Long Short Term Memory (LSTM)

𝑓𝑡 = 𝜎(𝑊𝑓𝑥𝑡 + 𝑈𝑓ℎ𝑡−1 + 𝑏𝑓)

𝑖𝑡 = 𝜎(𝑊𝑖𝑥𝑡 + 𝑈𝑖ℎ𝑡−1 + 𝑏𝑖)

𝑜𝑡 = 𝜎(𝑊𝑜𝑥𝑡 + 𝑈𝑜ℎ𝑡−1 + 𝑏𝑜)

𝑎𝑡 = 𝑡𝑎𝑛ℎ(𝑊𝑎𝑥𝑡 + 𝑈𝑎ℎ𝑡−1 + 𝑏𝑎)

𝐶𝑡 = 𝑎𝑡⨀𝑖𝑡 + 𝑓𝑡⨀𝐶𝑡−1ℎ𝑡 = tanh(𝐶𝑡)⨀𝑜𝑡

Q. 0 ≤ 𝜎 𝑥 ≤ 1, −1 ≤ 𝑡𝑎𝑛ℎ 𝑥 ≤ 1What if 𝜎 is used?

𝑥𝑡

𝐶𝑡−1

𝜎 𝑡𝑎𝑛ℎ

×

𝜎

× +

𝜎

×

𝑡𝑎𝑛ℎ

ℎ𝑡−1

𝐶𝑡

ℎ𝑡

𝑓𝑡 𝑖𝑡 𝑎𝑡𝑜𝑡

Page 22: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

Long Short Term Memory (LSTM)

𝑓𝑡 = 𝜎(𝑊𝑓𝑥𝑡 + 𝑈𝑓ℎ𝑡−1 + 𝑏𝑓)

𝑖𝑡 = 𝜎(𝑊𝑖𝑥𝑡 + 𝑈𝑖ℎ𝑡−1 + 𝑏𝑖)

𝑜𝑡 = 𝜎(𝑊𝑜𝑥𝑡 + 𝑈𝑜ℎ𝑡−1 + 𝑏𝑜)

𝑎𝑡 = 𝑡𝑎𝑛ℎ(𝑊𝑎𝑥𝑡 + 𝑈𝑎ℎ𝑡−1 + 𝑏𝑎)

𝐶𝑡 = 𝑎𝑡⨀𝑖𝑡 + 𝑓𝑡⨀𝐶𝑡−1ℎ𝑡 = tanh(𝐶𝑡)⨀𝑜𝑡

Do not have the previous state information

𝑥𝑡

𝐶𝑡−1

𝜎 𝑡𝑎𝑛ℎ

×

𝜎

× +

𝜎

×

𝑡𝑎𝑛ℎ

ℎ𝑡−1

𝐶𝑡

ℎ𝑡

𝑓𝑡 𝑖𝑡 𝑎𝑡𝑜𝑡

Page 23: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

𝑥𝑡

𝐶𝑡−1

𝜎 𝑡𝑎𝑛ℎ

×

𝜎

× +

𝜎

×

𝑡𝑎𝑛ℎ

ℎ𝑡−1

𝐶𝑡

ℎ𝑡

𝑓𝑡 𝑖𝑡 𝑎𝑡𝑜𝑡

LSTM with Peephole Connection

𝑓𝑡 = 𝜎(𝑊𝑓𝑥𝑡 + 𝑈𝑓ℎ𝑡−1 + 𝐶𝑡−1𝑃𝑓 + 𝑏𝑓)

𝑖𝑡 = 𝜎(𝑊𝑖𝑥𝑡 + 𝑈𝑖ℎ𝑡−1 + 𝐶𝑡−1𝑃𝑖 + 𝑏𝑖)

𝑜𝑡 = 𝜎(𝑊𝑜𝑥𝑡 + 𝑈𝑜ℎ𝑡−1 + 𝐶𝑡𝑃𝑜 + 𝑏𝑜)

𝑎𝑡 = 𝑡𝑎𝑛ℎ(𝑊𝑎𝑥𝑡 + 𝑈𝑎ℎ𝑡−1 + 𝑏𝑎)

𝐶𝑡 = 𝑎𝑡⨀𝑖𝑡 + 𝑓𝑡⨀𝐶𝑡−1ℎ𝑡 = tanh(𝐶𝑡)⨀𝑜𝑡

Page 24: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

𝑧𝑡 = 𝜎(𝑊𝑧𝑥𝑡 + 𝑈𝑧𝐶𝑡−1 + 𝑏𝑧)

𝑎𝑡 = 𝑡𝑎𝑛ℎ(𝑊𝑎𝑥𝑡 + 𝑈𝑎(𝑟𝑡⨀𝐶𝑡−1) + 𝑏𝑎)

𝐶𝑡 = 1 − 𝑧𝑡 ⨀𝑎𝑡 + 𝑧𝑡⨀𝐶𝑡−1

𝑟𝑡 = 𝜎(𝑊𝑟𝑥𝑡 + 𝑈𝑟𝐶𝑡−1 + 𝑏𝑟)

𝐶𝑡−1

𝜎 𝜎

×

𝑡𝑎𝑛ℎ

+

𝑥𝑡

𝐶𝑡

𝑧𝑡 𝑟𝑡

×

1 − ×𝑎𝑡

Gated Recurrent Unit (GRU)

Page 25: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

𝑧𝑡 = 𝜎(𝑊𝑧𝑥𝑡 + 𝑈𝑧𝐶𝑡−1 + 𝑏𝑧)

𝑎𝑡 = 𝑡𝑎𝑛ℎ(𝑊𝑎𝑥𝑡 + 𝑈𝑎(𝑟𝑡⨀𝐶𝑡−1) + 𝑏𝑎)

𝐶𝑡 = 1 − 𝑧𝑡 ⨀𝑎𝑡 + 𝑧𝑡⨀𝐶𝑡−1

𝑟𝑡 = 𝜎(𝑊𝑟𝑥𝑡 + 𝑈𝑟𝐶𝑡−1 + 𝑏𝑟)

𝐶𝑡−1

𝜎 𝜎

×

𝑡𝑎𝑛ℎ

+

𝑥𝑡

𝐶𝑡

𝑧𝑡 𝑟𝑡

×

1 − ×𝑎𝑡

Gated Recurrent Unit (GRU)

𝐶𝑡 = 𝑖𝑡⨀𝑎𝑡 + 𝑓𝑡⨀𝐶𝑡−1 in LSTM

Page 26: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

Hierarchical Architecture

Page 27: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

Stage-wise Training

𝑓 𝑓 𝑓 ⋯𝑠1 𝑠2 𝑠3

𝑓𝑠𝑛

car

• Image classification

• Give subsampling image step by step

Sub Image 1

Sub Image 2

Page 28: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

Stage-wise Training

𝑓 𝑓 𝑓 ⋯𝑠1 𝑠2 𝑠3

𝑓𝑠𝑛

car

• Image classification

• Give subsampling image step by step

Page 29: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

Hierarchical Deep Convolution Neural Network• Image classification

• Give subsampling image step by step

Shared Layers

Coarse Component independent layers

Fine Component 1 independent layers

Fine Component 2 independent layers

Fine Component n independent layers

⋮Images

Pro

bab

ilist

ic a

vera

gin

g la

yer

Page 30: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

Hierarchical Deep Convolution Neural Network• Image classification

• Give subsampling image step by step

Shared Layers

Building independent layers

Floor 1 independent layers

Floor 2 independent layers

Floor n independent layers

⋮Images

Pro

bab

ilist

ic a

vera

gin

g la

yer

Page 31: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

Hierarchical Auxiliary Learning

Page 32: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

Transportation

Animal

Straight lines

Has circle

Hierarchical Auxiliary Learning

Page 33: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

Building A

Building B

Building C

Building D

Hierarchical Auxiliary LearningFloor 1

Floor 2

Floor 3

Floor 1

Floor 2

Floor 3

Floor 4

Floor 1

Floor 2

Floor 3

Floor 4

Floor 1

Floor 2

Page 34: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

Dataset Case Semantics Superclass

MNIST Case1 ≥ 5 0:{5, 6, 7, 8, 9}, 1:{0, 1 ,2 ,3, 4}

SVHN Case2 Mod2 0:{1, 3, 5, 7, 9}, 1:{0, 2, 4, 6, 8}

Case3 Prime 0:{2, 3, 5, 7}, 1:{0, 1, 4, 6, 8, 9}

Case4 Circle/ curve/ straight line 0:{0, 6, 8, 9}, 1:{2, 3, 5}, 2:{1, 4, 7}

CIFAR-10 Case1 None 0:{5, 6, 7, 8, 9}, 1:{0, 1 ,2 ,3, 4}

Case2 None 0:{1, 3, 5, 7, 9}, 1:{0, 2, 4, 6, 8}

Case3 Transportation/ animal 0:{2, 3, 4, ,5, 6, 7}, 1:{0, 1, 8, 9}

Case4 Car/ small animal/ big animal/ craft/ others 0:{1, 9}, 1:{3, 5}, 2:{4, 7}, 3:{0, 8}, 4:{2, 6}

Dataset Class

MNIST, SVHN 0, 1, 2, 3, 4, 5, 6, 7, 8, 9

CIFAR-10 Airplane, car, bird, cat, deep, dog, frog, horse, ship, truck

Page 35: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

MNIST baseline Case1 Case2 Case3 Case4

Error 0.93 0.43±0.03 0.73±0.06 0.70±0.05 0.69±0.00

SVHN Baseline Case1 Case2 Case3 Case4

Error 4.05 2.53±0.06 2.64±0.11 2.66±0.07 2.86±0.07

CIFAR-10 baseline Case1 Case2 Case3 Case4

Error 6.81 3.30±0.06 5.30±0.14 6.46±0.08 5.13±0.09

CIFAR-10-Case1 CIFAR-10-Case2 CIFAR-10-Case3 CIFAR-10-Case4

Page 36: PowerPoint Presentation · What machines do What is a fhere? e.g., • temperature, humidity, the amount of cloud -> precipitation • population, the number of shops, the number

Thank you