Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN....

35
Accelerating Deep Learning with the Biggest Losers Angela H. Jiang, Daniel L.-K. Wong, Giulio Zhou, David G. Andersen, Jeffrey Dean, Gregory R. Ganger, Gauri Joshi, Michael Kaminsky, Michael A. Kozuch, Zachary C. Lipton, Padmanabhan Pillai

Transcript of Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN....

Page 1: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

Accelerating Deep Learning with the Biggest Losers

Angela H. Jiang, Daniel L.-K. Wong, Giulio Zhou,

David G. Andersen, Jeffrey Dean, Gregory R. Ganger,

Gauri Joshi, Michael Kaminsky, Michael A. Kozuch,

Zachary C. Lipton, Padmanabhan Pillai

Page 2: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

Deep learning enables emerging applications

Page 3: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

DNN training analyzes many examples

Page 4: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

Selective-Backprop prioritizes informative examples

Page 5: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

DNN basicsHow to use and train a DNN

Page 6: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

Example task: Image classification

Dog

Lizard

Goat

Bird

Page 7: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

.1

.2

.6

.1

DNN inference: From image to “Dog”

Dog

Lizard

Goat

Bird

Page 8: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

Training DNNs relies on a labeled dataset

Class: Dog

Class: Bird

Class: Lizard

Class: Dog

Page 9: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

DNN training: Determining the weights

Class1

Page 10: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

DNN training: Determining the weights via backpropagation

Class 3

Page 11: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

DNN training: Determining the weights via backpropagation

Class 2

Page 12: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

DNN training analyzes an example many times

Epoch 1Epoch 2

Page 13: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

DNN training analyzes an example many times

Epoch 250

Page 14: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

SelectiveBackprop targets slowest part of training

Page 15: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

Not all examples are equally useful

Page 16: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

Prioritize examples with high loss

Examples with low loss Examples with high loss

Page 17: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

Selective Backprop algorithm

Page 18: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

DNN training analyzes an example many times

Page 19: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

Bad idea #1:Deciding with a hard threshold

if loss > threshold: backprop()

Page 20: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

Bad idea #2:Deciding probabilistically with absolute loss

P(backprop) = normalize(loss, 0, 1)

Page 21: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

Good idea:Use relative probabilistic calculation

P(backprop) = Percentile(loss, recent losses)B

Page 22: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

Loss = 2.3

CDF(loss) = 0.9

CDF(loss) =0.9

Probability = .96

Example of probability calculation

Page 23: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

Selective-Backprop approach

Decide probabilisticallyif we should backprop

Forward propagate examplethrough the network

Calculate usefulness of backpropping example based on its accuracy

Page 24: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

StaleSB reduces forward passes

Forward propagate examplethrough the networkevery n epochs

Calculate usefulness of backpropping example based on its accuracy

Decide probabilisticallyif we should backprop

Page 25: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

Evaluation of Selective Backprop

Page 26: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

CIFAR1060,000 Training Images

SVHN604,388 Training Images

CIFAR10060,000 Training Images

Datasets

Page 27: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

Train CIFAR10 to 4.14% (1.4x Traditional’s final error)

SB: 1.5X faster

StaleSB: 2X faster

Page 28: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

Train CIFAR100 to 25.5% (1.4x Traditional’s final error)

SB: 1.2X faster

StaleSB: 1.6X faster

Page 29: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

Train SVHN to 1.72% (1.4x Traditional’s final error)

SB: 3.5X faster

StaleSB: 5X faster

Page 30: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

= [0.1, 0.3, 0.6]

= [0, 1, 0]

= 0.3

3% correct w/ Traditional

29% correct w/ SB

SB on CIFAR10 targets hard examples

Page 31: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

Selective-Backprop accelerates training✓

SelectiveBackprop outperforms static approaches✓Trains up to 3.5x faster compared to standard SGD

Trains 1.02-1.8X faster than state-of-the-art importance sampling approach

Stale-SB further accelerates training✓Trains on average 26% faster compared to SB

www.github.com/angelajiang/SelectiveBackprop

Reduces time spent in the backwards pass by prioritizing high-loss examples

Page 32: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

Compared approaches

Traditional

Katharopoulos18

Selective-Backprop(Us)

Classic SGD with no filtering

State of the art importance sampling approach

RandomRandom sampling approach

Page 33: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

SVHN

Given 40 hours to train,

Stale-SB provides lowest error

Page 34: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

Most Pareto optimal points are SB or StaleSB

CIFAR10 CIFAR100

Given unlimited time to train,

Traditional provides lowest error

Page 35: Accelerating Deep Learning with the Biggest Losers · DNN basics How to use and train a DNN. Example task: Image classification Dog Lizard Goat Bird.1.2.6.1 DNN inference: From image

SB is robust to modest amounts of error

0.1% Randomized 10% Randomized 20% Randomized