Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl...

34
1 © 2015 The MathWorks, Inc. Deep Learning for Computer Vision with MATLAB By Jon Cherrie

Transcript of Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl...

Page 1: Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl and his colleagues won $22,000 with a deep-learning system. 'We improved on Merck's

1copy 2015 The MathWorks Inc

Deep Learning for Computer Vision with MATLAB

By Jon Cherrie

2

Deep learning is getting a lot of attention

Dahl and his colleagues won $22000 with a deep-

learning system We improved on Mercks baseline by

about 15

- Nature 2014

When Google adopted deep-learning-based speech

recognition in its Android smartphone operating system it

achieved a 25 reduction in word errors

- Nature 2014

[Baidus] system has achieved the best result to date

with a top-5 error rate of 458 and exceeding the human

recognition performanceldquo

- HPCWire 2015

3

Agenda

What is deep learning

Demo ndash object recognition

Challenges with deep learning

Why MATLAB

4

What is deep learning

5

Example Problem ndash Image Classification

Model

Tractor

Bicycle

6

Typical Computer Vision Model

MODELSUPERVISED

LEARNINGPREPROCESSING

KMEANSAUTO-

ENCODER

PCAGMM CLASSIFICATION

REGRESSION

LOAD

DATA

FEATURE

EXTRACTIONTRAINING

Support Vector

Machine

7

Deep neural network

Some of these layers will be detecting ldquofeaturesrdquo

Other layers will do classification

All the layers are trained together

LayerMODELDATA

IMAGES Layer Layer LayerLayer

8

Deep learning asymp convolutional neural network

A convolutional neural network (ConvNet or CNN) is made up of different

types of layers

Convolution

Rectified linear unit (ReLU)

Pooling

Fully connected layers

9

Convolution

A convolutional layer operates on a three-dimensional array ie an image

with red green and blue channels

10

Convolutions tend to act as edge filters

11

ReLU

12

Pooling

13

Average Pooling

0 1 1 1 0 0

0 2 1 0 1 0

2 1 0 0 0 0

2 2 1 4 3 4

2 3 2 3 4 4

1 2 3 4 3 3

09 01

2 36

3 times 3

Average

Pooling

14

Max Pooling

0 1 1 1 0 0

0 2 1 0 1 0

2 1 0 0 0 0

2 2 1 4 3 4

2 3 2 3 4 4

1 2 3 4 3 3

2 1

3 4

3 times 3

Max

Pooling

15

Fully connected layers

16

Layers

Convolution

ReLU

Pooling

Fully connected

Softmax

Local response normalization

hellip

17

A deep network might be hellip

18

Demo ndash Object Recognition

19

Training

Trained to perform classification on the ImageNet ILSVRC challenge data

ndash 12 million images of varying size cropped to 224x224

ndash Each image falls into one of 1000 categories

Training takes approximately a week

ndash This demo doesnrsquot show training

We will use a pre-trained network vgg-f

20

Object Recognition using Deep Learning

Training

(using GPU)Millions of images from 1000 different categories

Prediction Real-time object recognition using a webcam connected to a

laptop

21

Challenges with deep learning

22

Large number of

parameters to find

Layer Details Output Size Number of Parameters

Input 224x224x3

Conv 1 64 filters 11x11Stride 4 Pad 0

54x54x64 6411113 = 23232

LRN 54x54x64

Max Pool x2 downsample 27x27x64

Conv 2 256 5x5Stride 1 Pad 1

25x25x256 2565564 = 409600

LRN 25x25x256

Max Pool x2 downsample 12x12x256

Conv 3 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 4 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 5 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Max Pool x2 downsample 6x6x256

Full Connect 6 4096 4096x1 662564096 = 37748736 (38 million)

Dropout 4096x1

Full connect 7 4096 4096x1 40964096 = 16777216 (168 million)

Dropout 4096x1

Full connect 8 1000 1000x1 40961000 = 4096000 (4 million)

Softmax 1000x1

TOTAL 61 million

61 million parameters to

find by training on data

23

Need many images in training set

24

Tools for pre- and post-processing

Also

removing average

distortions eg

rotation amp flips

etcResizingCropping

25

Iterative design

26

Why MATLAB for deep learning

27

Why MATLAB for Deep Learning

Ability to work with signal images financial geospatial etc data

Library of algorithms for image signal and computer vision

Built-in GPU support for functions such as image rotation convolution

transformation and filtering

Visualization

Lots of community packages eg MatConvNet Caffe deep learning

toolbox in File Exchange

28

29

Start with a pre-trained network

30

Managing image datasets

imageSet (new in R2014b)

Automated file-based workflow

ndash Labelling

ndash Partition

ndash Reading

ndash Indexing

Integrated in Computer

Vision workflows

31

Image Acquisition Toolbox

Support for

ndash industry standards including

DCAM Camera Link

and GigE Vision

ndash Common OS interfaces for webcams including Direct Show QuickTime

and video4linux2

ndash A range of industrial and scientific hardware vendors

ndash Microsoft Kinect

Built-in MATLAB support for

ndash Webcams

ndash IP Cameras

Supported hardware

32

How MATLAB addresses challenges

Large sets of images that dont fit in memory imageSet

Image Processing and Computer Vision tools for pre- and post-processing

Long running training built-in GPU support for over 200 MATLAB functions

45 Image Processing function 90 Statistics and Machine Learning

functions etc

MATLAB offers flexible architecture for customized workflows

Community toolboxes for ConvNets

33

FIN

34

FIN

Page 2: Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl and his colleagues won $22,000 with a deep-learning system. 'We improved on Merck's

2

Deep learning is getting a lot of attention

Dahl and his colleagues won $22000 with a deep-

learning system We improved on Mercks baseline by

about 15

- Nature 2014

When Google adopted deep-learning-based speech

recognition in its Android smartphone operating system it

achieved a 25 reduction in word errors

- Nature 2014

[Baidus] system has achieved the best result to date

with a top-5 error rate of 458 and exceeding the human

recognition performanceldquo

- HPCWire 2015

3

Agenda

What is deep learning

Demo ndash object recognition

Challenges with deep learning

Why MATLAB

4

What is deep learning

5

Example Problem ndash Image Classification

Model

Tractor

Bicycle

6

Typical Computer Vision Model

MODELSUPERVISED

LEARNINGPREPROCESSING

KMEANSAUTO-

ENCODER

PCAGMM CLASSIFICATION

REGRESSION

LOAD

DATA

FEATURE

EXTRACTIONTRAINING

Support Vector

Machine

7

Deep neural network

Some of these layers will be detecting ldquofeaturesrdquo

Other layers will do classification

All the layers are trained together

LayerMODELDATA

IMAGES Layer Layer LayerLayer

8

Deep learning asymp convolutional neural network

A convolutional neural network (ConvNet or CNN) is made up of different

types of layers

Convolution

Rectified linear unit (ReLU)

Pooling

Fully connected layers

9

Convolution

A convolutional layer operates on a three-dimensional array ie an image

with red green and blue channels

10

Convolutions tend to act as edge filters

11

ReLU

12

Pooling

13

Average Pooling

0 1 1 1 0 0

0 2 1 0 1 0

2 1 0 0 0 0

2 2 1 4 3 4

2 3 2 3 4 4

1 2 3 4 3 3

09 01

2 36

3 times 3

Average

Pooling

14

Max Pooling

0 1 1 1 0 0

0 2 1 0 1 0

2 1 0 0 0 0

2 2 1 4 3 4

2 3 2 3 4 4

1 2 3 4 3 3

2 1

3 4

3 times 3

Max

Pooling

15

Fully connected layers

16

Layers

Convolution

ReLU

Pooling

Fully connected

Softmax

Local response normalization

hellip

17

A deep network might be hellip

18

Demo ndash Object Recognition

19

Training

Trained to perform classification on the ImageNet ILSVRC challenge data

ndash 12 million images of varying size cropped to 224x224

ndash Each image falls into one of 1000 categories

Training takes approximately a week

ndash This demo doesnrsquot show training

We will use a pre-trained network vgg-f

20

Object Recognition using Deep Learning

Training

(using GPU)Millions of images from 1000 different categories

Prediction Real-time object recognition using a webcam connected to a

laptop

21

Challenges with deep learning

22

Large number of

parameters to find

Layer Details Output Size Number of Parameters

Input 224x224x3

Conv 1 64 filters 11x11Stride 4 Pad 0

54x54x64 6411113 = 23232

LRN 54x54x64

Max Pool x2 downsample 27x27x64

Conv 2 256 5x5Stride 1 Pad 1

25x25x256 2565564 = 409600

LRN 25x25x256

Max Pool x2 downsample 12x12x256

Conv 3 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 4 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 5 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Max Pool x2 downsample 6x6x256

Full Connect 6 4096 4096x1 662564096 = 37748736 (38 million)

Dropout 4096x1

Full connect 7 4096 4096x1 40964096 = 16777216 (168 million)

Dropout 4096x1

Full connect 8 1000 1000x1 40961000 = 4096000 (4 million)

Softmax 1000x1

TOTAL 61 million

61 million parameters to

find by training on data

23

Need many images in training set

24

Tools for pre- and post-processing

Also

removing average

distortions eg

rotation amp flips

etcResizingCropping

25

Iterative design

26

Why MATLAB for deep learning

27

Why MATLAB for Deep Learning

Ability to work with signal images financial geospatial etc data

Library of algorithms for image signal and computer vision

Built-in GPU support for functions such as image rotation convolution

transformation and filtering

Visualization

Lots of community packages eg MatConvNet Caffe deep learning

toolbox in File Exchange

28

29

Start with a pre-trained network

30

Managing image datasets

imageSet (new in R2014b)

Automated file-based workflow

ndash Labelling

ndash Partition

ndash Reading

ndash Indexing

Integrated in Computer

Vision workflows

31

Image Acquisition Toolbox

Support for

ndash industry standards including

DCAM Camera Link

and GigE Vision

ndash Common OS interfaces for webcams including Direct Show QuickTime

and video4linux2

ndash A range of industrial and scientific hardware vendors

ndash Microsoft Kinect

Built-in MATLAB support for

ndash Webcams

ndash IP Cameras

Supported hardware

32

How MATLAB addresses challenges

Large sets of images that dont fit in memory imageSet

Image Processing and Computer Vision tools for pre- and post-processing

Long running training built-in GPU support for over 200 MATLAB functions

45 Image Processing function 90 Statistics and Machine Learning

functions etc

MATLAB offers flexible architecture for customized workflows

Community toolboxes for ConvNets

33

FIN

34

FIN

Page 3: Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl and his colleagues won $22,000 with a deep-learning system. 'We improved on Merck's

3

Agenda

What is deep learning

Demo ndash object recognition

Challenges with deep learning

Why MATLAB

4

What is deep learning

5

Example Problem ndash Image Classification

Model

Tractor

Bicycle

6

Typical Computer Vision Model

MODELSUPERVISED

LEARNINGPREPROCESSING

KMEANSAUTO-

ENCODER

PCAGMM CLASSIFICATION

REGRESSION

LOAD

DATA

FEATURE

EXTRACTIONTRAINING

Support Vector

Machine

7

Deep neural network

Some of these layers will be detecting ldquofeaturesrdquo

Other layers will do classification

All the layers are trained together

LayerMODELDATA

IMAGES Layer Layer LayerLayer

8

Deep learning asymp convolutional neural network

A convolutional neural network (ConvNet or CNN) is made up of different

types of layers

Convolution

Rectified linear unit (ReLU)

Pooling

Fully connected layers

9

Convolution

A convolutional layer operates on a three-dimensional array ie an image

with red green and blue channels

10

Convolutions tend to act as edge filters

11

ReLU

12

Pooling

13

Average Pooling

0 1 1 1 0 0

0 2 1 0 1 0

2 1 0 0 0 0

2 2 1 4 3 4

2 3 2 3 4 4

1 2 3 4 3 3

09 01

2 36

3 times 3

Average

Pooling

14

Max Pooling

0 1 1 1 0 0

0 2 1 0 1 0

2 1 0 0 0 0

2 2 1 4 3 4

2 3 2 3 4 4

1 2 3 4 3 3

2 1

3 4

3 times 3

Max

Pooling

15

Fully connected layers

16

Layers

Convolution

ReLU

Pooling

Fully connected

Softmax

Local response normalization

hellip

17

A deep network might be hellip

18

Demo ndash Object Recognition

19

Training

Trained to perform classification on the ImageNet ILSVRC challenge data

ndash 12 million images of varying size cropped to 224x224

ndash Each image falls into one of 1000 categories

Training takes approximately a week

ndash This demo doesnrsquot show training

We will use a pre-trained network vgg-f

20

Object Recognition using Deep Learning

Training

(using GPU)Millions of images from 1000 different categories

Prediction Real-time object recognition using a webcam connected to a

laptop

21

Challenges with deep learning

22

Large number of

parameters to find

Layer Details Output Size Number of Parameters

Input 224x224x3

Conv 1 64 filters 11x11Stride 4 Pad 0

54x54x64 6411113 = 23232

LRN 54x54x64

Max Pool x2 downsample 27x27x64

Conv 2 256 5x5Stride 1 Pad 1

25x25x256 2565564 = 409600

LRN 25x25x256

Max Pool x2 downsample 12x12x256

Conv 3 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 4 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 5 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Max Pool x2 downsample 6x6x256

Full Connect 6 4096 4096x1 662564096 = 37748736 (38 million)

Dropout 4096x1

Full connect 7 4096 4096x1 40964096 = 16777216 (168 million)

Dropout 4096x1

Full connect 8 1000 1000x1 40961000 = 4096000 (4 million)

Softmax 1000x1

TOTAL 61 million

61 million parameters to

find by training on data

23

Need many images in training set

24

Tools for pre- and post-processing

Also

removing average

distortions eg

rotation amp flips

etcResizingCropping

25

Iterative design

26

Why MATLAB for deep learning

27

Why MATLAB for Deep Learning

Ability to work with signal images financial geospatial etc data

Library of algorithms for image signal and computer vision

Built-in GPU support for functions such as image rotation convolution

transformation and filtering

Visualization

Lots of community packages eg MatConvNet Caffe deep learning

toolbox in File Exchange

28

29

Start with a pre-trained network

30

Managing image datasets

imageSet (new in R2014b)

Automated file-based workflow

ndash Labelling

ndash Partition

ndash Reading

ndash Indexing

Integrated in Computer

Vision workflows

31

Image Acquisition Toolbox

Support for

ndash industry standards including

DCAM Camera Link

and GigE Vision

ndash Common OS interfaces for webcams including Direct Show QuickTime

and video4linux2

ndash A range of industrial and scientific hardware vendors

ndash Microsoft Kinect

Built-in MATLAB support for

ndash Webcams

ndash IP Cameras

Supported hardware

32

How MATLAB addresses challenges

Large sets of images that dont fit in memory imageSet

Image Processing and Computer Vision tools for pre- and post-processing

Long running training built-in GPU support for over 200 MATLAB functions

45 Image Processing function 90 Statistics and Machine Learning

functions etc

MATLAB offers flexible architecture for customized workflows

Community toolboxes for ConvNets

33

FIN

34

FIN

Page 4: Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl and his colleagues won $22,000 with a deep-learning system. 'We improved on Merck's

4

What is deep learning

5

Example Problem ndash Image Classification

Model

Tractor

Bicycle

6

Typical Computer Vision Model

MODELSUPERVISED

LEARNINGPREPROCESSING

KMEANSAUTO-

ENCODER

PCAGMM CLASSIFICATION

REGRESSION

LOAD

DATA

FEATURE

EXTRACTIONTRAINING

Support Vector

Machine

7

Deep neural network

Some of these layers will be detecting ldquofeaturesrdquo

Other layers will do classification

All the layers are trained together

LayerMODELDATA

IMAGES Layer Layer LayerLayer

8

Deep learning asymp convolutional neural network

A convolutional neural network (ConvNet or CNN) is made up of different

types of layers

Convolution

Rectified linear unit (ReLU)

Pooling

Fully connected layers

9

Convolution

A convolutional layer operates on a three-dimensional array ie an image

with red green and blue channels

10

Convolutions tend to act as edge filters

11

ReLU

12

Pooling

13

Average Pooling

0 1 1 1 0 0

0 2 1 0 1 0

2 1 0 0 0 0

2 2 1 4 3 4

2 3 2 3 4 4

1 2 3 4 3 3

09 01

2 36

3 times 3

Average

Pooling

14

Max Pooling

0 1 1 1 0 0

0 2 1 0 1 0

2 1 0 0 0 0

2 2 1 4 3 4

2 3 2 3 4 4

1 2 3 4 3 3

2 1

3 4

3 times 3

Max

Pooling

15

Fully connected layers

16

Layers

Convolution

ReLU

Pooling

Fully connected

Softmax

Local response normalization

hellip

17

A deep network might be hellip

18

Demo ndash Object Recognition

19

Training

Trained to perform classification on the ImageNet ILSVRC challenge data

ndash 12 million images of varying size cropped to 224x224

ndash Each image falls into one of 1000 categories

Training takes approximately a week

ndash This demo doesnrsquot show training

We will use a pre-trained network vgg-f

20

Object Recognition using Deep Learning

Training

(using GPU)Millions of images from 1000 different categories

Prediction Real-time object recognition using a webcam connected to a

laptop

21

Challenges with deep learning

22

Large number of

parameters to find

Layer Details Output Size Number of Parameters

Input 224x224x3

Conv 1 64 filters 11x11Stride 4 Pad 0

54x54x64 6411113 = 23232

LRN 54x54x64

Max Pool x2 downsample 27x27x64

Conv 2 256 5x5Stride 1 Pad 1

25x25x256 2565564 = 409600

LRN 25x25x256

Max Pool x2 downsample 12x12x256

Conv 3 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 4 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 5 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Max Pool x2 downsample 6x6x256

Full Connect 6 4096 4096x1 662564096 = 37748736 (38 million)

Dropout 4096x1

Full connect 7 4096 4096x1 40964096 = 16777216 (168 million)

Dropout 4096x1

Full connect 8 1000 1000x1 40961000 = 4096000 (4 million)

Softmax 1000x1

TOTAL 61 million

61 million parameters to

find by training on data

23

Need many images in training set

24

Tools for pre- and post-processing

Also

removing average

distortions eg

rotation amp flips

etcResizingCropping

25

Iterative design

26

Why MATLAB for deep learning

27

Why MATLAB for Deep Learning

Ability to work with signal images financial geospatial etc data

Library of algorithms for image signal and computer vision

Built-in GPU support for functions such as image rotation convolution

transformation and filtering

Visualization

Lots of community packages eg MatConvNet Caffe deep learning

toolbox in File Exchange

28

29

Start with a pre-trained network

30

Managing image datasets

imageSet (new in R2014b)

Automated file-based workflow

ndash Labelling

ndash Partition

ndash Reading

ndash Indexing

Integrated in Computer

Vision workflows

31

Image Acquisition Toolbox

Support for

ndash industry standards including

DCAM Camera Link

and GigE Vision

ndash Common OS interfaces for webcams including Direct Show QuickTime

and video4linux2

ndash A range of industrial and scientific hardware vendors

ndash Microsoft Kinect

Built-in MATLAB support for

ndash Webcams

ndash IP Cameras

Supported hardware

32

How MATLAB addresses challenges

Large sets of images that dont fit in memory imageSet

Image Processing and Computer Vision tools for pre- and post-processing

Long running training built-in GPU support for over 200 MATLAB functions

45 Image Processing function 90 Statistics and Machine Learning

functions etc

MATLAB offers flexible architecture for customized workflows

Community toolboxes for ConvNets

33

FIN

34

FIN

Page 5: Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl and his colleagues won $22,000 with a deep-learning system. 'We improved on Merck's

5

Example Problem ndash Image Classification

Model

Tractor

Bicycle

6

Typical Computer Vision Model

MODELSUPERVISED

LEARNINGPREPROCESSING

KMEANSAUTO-

ENCODER

PCAGMM CLASSIFICATION

REGRESSION

LOAD

DATA

FEATURE

EXTRACTIONTRAINING

Support Vector

Machine

7

Deep neural network

Some of these layers will be detecting ldquofeaturesrdquo

Other layers will do classification

All the layers are trained together

LayerMODELDATA

IMAGES Layer Layer LayerLayer

8

Deep learning asymp convolutional neural network

A convolutional neural network (ConvNet or CNN) is made up of different

types of layers

Convolution

Rectified linear unit (ReLU)

Pooling

Fully connected layers

9

Convolution

A convolutional layer operates on a three-dimensional array ie an image

with red green and blue channels

10

Convolutions tend to act as edge filters

11

ReLU

12

Pooling

13

Average Pooling

0 1 1 1 0 0

0 2 1 0 1 0

2 1 0 0 0 0

2 2 1 4 3 4

2 3 2 3 4 4

1 2 3 4 3 3

09 01

2 36

3 times 3

Average

Pooling

14

Max Pooling

0 1 1 1 0 0

0 2 1 0 1 0

2 1 0 0 0 0

2 2 1 4 3 4

2 3 2 3 4 4

1 2 3 4 3 3

2 1

3 4

3 times 3

Max

Pooling

15

Fully connected layers

16

Layers

Convolution

ReLU

Pooling

Fully connected

Softmax

Local response normalization

hellip

17

A deep network might be hellip

18

Demo ndash Object Recognition

19

Training

Trained to perform classification on the ImageNet ILSVRC challenge data

ndash 12 million images of varying size cropped to 224x224

ndash Each image falls into one of 1000 categories

Training takes approximately a week

ndash This demo doesnrsquot show training

We will use a pre-trained network vgg-f

20

Object Recognition using Deep Learning

Training

(using GPU)Millions of images from 1000 different categories

Prediction Real-time object recognition using a webcam connected to a

laptop

21

Challenges with deep learning

22

Large number of

parameters to find

Layer Details Output Size Number of Parameters

Input 224x224x3

Conv 1 64 filters 11x11Stride 4 Pad 0

54x54x64 6411113 = 23232

LRN 54x54x64

Max Pool x2 downsample 27x27x64

Conv 2 256 5x5Stride 1 Pad 1

25x25x256 2565564 = 409600

LRN 25x25x256

Max Pool x2 downsample 12x12x256

Conv 3 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 4 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 5 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Max Pool x2 downsample 6x6x256

Full Connect 6 4096 4096x1 662564096 = 37748736 (38 million)

Dropout 4096x1

Full connect 7 4096 4096x1 40964096 = 16777216 (168 million)

Dropout 4096x1

Full connect 8 1000 1000x1 40961000 = 4096000 (4 million)

Softmax 1000x1

TOTAL 61 million

61 million parameters to

find by training on data

23

Need many images in training set

24

Tools for pre- and post-processing

Also

removing average

distortions eg

rotation amp flips

etcResizingCropping

25

Iterative design

26

Why MATLAB for deep learning

27

Why MATLAB for Deep Learning

Ability to work with signal images financial geospatial etc data

Library of algorithms for image signal and computer vision

Built-in GPU support for functions such as image rotation convolution

transformation and filtering

Visualization

Lots of community packages eg MatConvNet Caffe deep learning

toolbox in File Exchange

28

29

Start with a pre-trained network

30

Managing image datasets

imageSet (new in R2014b)

Automated file-based workflow

ndash Labelling

ndash Partition

ndash Reading

ndash Indexing

Integrated in Computer

Vision workflows

31

Image Acquisition Toolbox

Support for

ndash industry standards including

DCAM Camera Link

and GigE Vision

ndash Common OS interfaces for webcams including Direct Show QuickTime

and video4linux2

ndash A range of industrial and scientific hardware vendors

ndash Microsoft Kinect

Built-in MATLAB support for

ndash Webcams

ndash IP Cameras

Supported hardware

32

How MATLAB addresses challenges

Large sets of images that dont fit in memory imageSet

Image Processing and Computer Vision tools for pre- and post-processing

Long running training built-in GPU support for over 200 MATLAB functions

45 Image Processing function 90 Statistics and Machine Learning

functions etc

MATLAB offers flexible architecture for customized workflows

Community toolboxes for ConvNets

33

FIN

34

FIN

Page 6: Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl and his colleagues won $22,000 with a deep-learning system. 'We improved on Merck's

6

Typical Computer Vision Model

MODELSUPERVISED

LEARNINGPREPROCESSING

KMEANSAUTO-

ENCODER

PCAGMM CLASSIFICATION

REGRESSION

LOAD

DATA

FEATURE

EXTRACTIONTRAINING

Support Vector

Machine

7

Deep neural network

Some of these layers will be detecting ldquofeaturesrdquo

Other layers will do classification

All the layers are trained together

LayerMODELDATA

IMAGES Layer Layer LayerLayer

8

Deep learning asymp convolutional neural network

A convolutional neural network (ConvNet or CNN) is made up of different

types of layers

Convolution

Rectified linear unit (ReLU)

Pooling

Fully connected layers

9

Convolution

A convolutional layer operates on a three-dimensional array ie an image

with red green and blue channels

10

Convolutions tend to act as edge filters

11

ReLU

12

Pooling

13

Average Pooling

0 1 1 1 0 0

0 2 1 0 1 0

2 1 0 0 0 0

2 2 1 4 3 4

2 3 2 3 4 4

1 2 3 4 3 3

09 01

2 36

3 times 3

Average

Pooling

14

Max Pooling

0 1 1 1 0 0

0 2 1 0 1 0

2 1 0 0 0 0

2 2 1 4 3 4

2 3 2 3 4 4

1 2 3 4 3 3

2 1

3 4

3 times 3

Max

Pooling

15

Fully connected layers

16

Layers

Convolution

ReLU

Pooling

Fully connected

Softmax

Local response normalization

hellip

17

A deep network might be hellip

18

Demo ndash Object Recognition

19

Training

Trained to perform classification on the ImageNet ILSVRC challenge data

ndash 12 million images of varying size cropped to 224x224

ndash Each image falls into one of 1000 categories

Training takes approximately a week

ndash This demo doesnrsquot show training

We will use a pre-trained network vgg-f

20

Object Recognition using Deep Learning

Training

(using GPU)Millions of images from 1000 different categories

Prediction Real-time object recognition using a webcam connected to a

laptop

21

Challenges with deep learning

22

Large number of

parameters to find

Layer Details Output Size Number of Parameters

Input 224x224x3

Conv 1 64 filters 11x11Stride 4 Pad 0

54x54x64 6411113 = 23232

LRN 54x54x64

Max Pool x2 downsample 27x27x64

Conv 2 256 5x5Stride 1 Pad 1

25x25x256 2565564 = 409600

LRN 25x25x256

Max Pool x2 downsample 12x12x256

Conv 3 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 4 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 5 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Max Pool x2 downsample 6x6x256

Full Connect 6 4096 4096x1 662564096 = 37748736 (38 million)

Dropout 4096x1

Full connect 7 4096 4096x1 40964096 = 16777216 (168 million)

Dropout 4096x1

Full connect 8 1000 1000x1 40961000 = 4096000 (4 million)

Softmax 1000x1

TOTAL 61 million

61 million parameters to

find by training on data

23

Need many images in training set

24

Tools for pre- and post-processing

Also

removing average

distortions eg

rotation amp flips

etcResizingCropping

25

Iterative design

26

Why MATLAB for deep learning

27

Why MATLAB for Deep Learning

Ability to work with signal images financial geospatial etc data

Library of algorithms for image signal and computer vision

Built-in GPU support for functions such as image rotation convolution

transformation and filtering

Visualization

Lots of community packages eg MatConvNet Caffe deep learning

toolbox in File Exchange

28

29

Start with a pre-trained network

30

Managing image datasets

imageSet (new in R2014b)

Automated file-based workflow

ndash Labelling

ndash Partition

ndash Reading

ndash Indexing

Integrated in Computer

Vision workflows

31

Image Acquisition Toolbox

Support for

ndash industry standards including

DCAM Camera Link

and GigE Vision

ndash Common OS interfaces for webcams including Direct Show QuickTime

and video4linux2

ndash A range of industrial and scientific hardware vendors

ndash Microsoft Kinect

Built-in MATLAB support for

ndash Webcams

ndash IP Cameras

Supported hardware

32

How MATLAB addresses challenges

Large sets of images that dont fit in memory imageSet

Image Processing and Computer Vision tools for pre- and post-processing

Long running training built-in GPU support for over 200 MATLAB functions

45 Image Processing function 90 Statistics and Machine Learning

functions etc

MATLAB offers flexible architecture for customized workflows

Community toolboxes for ConvNets

33

FIN

34

FIN

Page 7: Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl and his colleagues won $22,000 with a deep-learning system. 'We improved on Merck's

7

Deep neural network

Some of these layers will be detecting ldquofeaturesrdquo

Other layers will do classification

All the layers are trained together

LayerMODELDATA

IMAGES Layer Layer LayerLayer

8

Deep learning asymp convolutional neural network

A convolutional neural network (ConvNet or CNN) is made up of different

types of layers

Convolution

Rectified linear unit (ReLU)

Pooling

Fully connected layers

9

Convolution

A convolutional layer operates on a three-dimensional array ie an image

with red green and blue channels

10

Convolutions tend to act as edge filters

11

ReLU

12

Pooling

13

Average Pooling

0 1 1 1 0 0

0 2 1 0 1 0

2 1 0 0 0 0

2 2 1 4 3 4

2 3 2 3 4 4

1 2 3 4 3 3

09 01

2 36

3 times 3

Average

Pooling

14

Max Pooling

0 1 1 1 0 0

0 2 1 0 1 0

2 1 0 0 0 0

2 2 1 4 3 4

2 3 2 3 4 4

1 2 3 4 3 3

2 1

3 4

3 times 3

Max

Pooling

15

Fully connected layers

16

Layers

Convolution

ReLU

Pooling

Fully connected

Softmax

Local response normalization

hellip

17

A deep network might be hellip

18

Demo ndash Object Recognition

19

Training

Trained to perform classification on the ImageNet ILSVRC challenge data

ndash 12 million images of varying size cropped to 224x224

ndash Each image falls into one of 1000 categories

Training takes approximately a week

ndash This demo doesnrsquot show training

We will use a pre-trained network vgg-f

20

Object Recognition using Deep Learning

Training

(using GPU)Millions of images from 1000 different categories

Prediction Real-time object recognition using a webcam connected to a

laptop

21

Challenges with deep learning

22

Large number of

parameters to find

Layer Details Output Size Number of Parameters

Input 224x224x3

Conv 1 64 filters 11x11Stride 4 Pad 0

54x54x64 6411113 = 23232

LRN 54x54x64

Max Pool x2 downsample 27x27x64

Conv 2 256 5x5Stride 1 Pad 1

25x25x256 2565564 = 409600

LRN 25x25x256

Max Pool x2 downsample 12x12x256

Conv 3 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 4 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 5 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Max Pool x2 downsample 6x6x256

Full Connect 6 4096 4096x1 662564096 = 37748736 (38 million)

Dropout 4096x1

Full connect 7 4096 4096x1 40964096 = 16777216 (168 million)

Dropout 4096x1

Full connect 8 1000 1000x1 40961000 = 4096000 (4 million)

Softmax 1000x1

TOTAL 61 million

61 million parameters to

find by training on data

23

Need many images in training set

24

Tools for pre- and post-processing

Also

removing average

distortions eg

rotation amp flips

etcResizingCropping

25

Iterative design

26

Why MATLAB for deep learning

27

Why MATLAB for Deep Learning

Ability to work with signal images financial geospatial etc data

Library of algorithms for image signal and computer vision

Built-in GPU support for functions such as image rotation convolution

transformation and filtering

Visualization

Lots of community packages eg MatConvNet Caffe deep learning

toolbox in File Exchange

28

29

Start with a pre-trained network

30

Managing image datasets

imageSet (new in R2014b)

Automated file-based workflow

ndash Labelling

ndash Partition

ndash Reading

ndash Indexing

Integrated in Computer

Vision workflows

31

Image Acquisition Toolbox

Support for

ndash industry standards including

DCAM Camera Link

and GigE Vision

ndash Common OS interfaces for webcams including Direct Show QuickTime

and video4linux2

ndash A range of industrial and scientific hardware vendors

ndash Microsoft Kinect

Built-in MATLAB support for

ndash Webcams

ndash IP Cameras

Supported hardware

32

How MATLAB addresses challenges

Large sets of images that dont fit in memory imageSet

Image Processing and Computer Vision tools for pre- and post-processing

Long running training built-in GPU support for over 200 MATLAB functions

45 Image Processing function 90 Statistics and Machine Learning

functions etc

MATLAB offers flexible architecture for customized workflows

Community toolboxes for ConvNets

33

FIN

34

FIN

Page 8: Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl and his colleagues won $22,000 with a deep-learning system. 'We improved on Merck's

8

Deep learning asymp convolutional neural network

A convolutional neural network (ConvNet or CNN) is made up of different

types of layers

Convolution

Rectified linear unit (ReLU)

Pooling

Fully connected layers

9

Convolution

A convolutional layer operates on a three-dimensional array ie an image

with red green and blue channels

10

Convolutions tend to act as edge filters

11

ReLU

12

Pooling

13

Average Pooling

0 1 1 1 0 0

0 2 1 0 1 0

2 1 0 0 0 0

2 2 1 4 3 4

2 3 2 3 4 4

1 2 3 4 3 3

09 01

2 36

3 times 3

Average

Pooling

14

Max Pooling

0 1 1 1 0 0

0 2 1 0 1 0

2 1 0 0 0 0

2 2 1 4 3 4

2 3 2 3 4 4

1 2 3 4 3 3

2 1

3 4

3 times 3

Max

Pooling

15

Fully connected layers

16

Layers

Convolution

ReLU

Pooling

Fully connected

Softmax

Local response normalization

hellip

17

A deep network might be hellip

18

Demo ndash Object Recognition

19

Training

Trained to perform classification on the ImageNet ILSVRC challenge data

ndash 12 million images of varying size cropped to 224x224

ndash Each image falls into one of 1000 categories

Training takes approximately a week

ndash This demo doesnrsquot show training

We will use a pre-trained network vgg-f

20

Object Recognition using Deep Learning

Training

(using GPU)Millions of images from 1000 different categories

Prediction Real-time object recognition using a webcam connected to a

laptop

21

Challenges with deep learning

22

Large number of

parameters to find

Layer Details Output Size Number of Parameters

Input 224x224x3

Conv 1 64 filters 11x11Stride 4 Pad 0

54x54x64 6411113 = 23232

LRN 54x54x64

Max Pool x2 downsample 27x27x64

Conv 2 256 5x5Stride 1 Pad 1

25x25x256 2565564 = 409600

LRN 25x25x256

Max Pool x2 downsample 12x12x256

Conv 3 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 4 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 5 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Max Pool x2 downsample 6x6x256

Full Connect 6 4096 4096x1 662564096 = 37748736 (38 million)

Dropout 4096x1

Full connect 7 4096 4096x1 40964096 = 16777216 (168 million)

Dropout 4096x1

Full connect 8 1000 1000x1 40961000 = 4096000 (4 million)

Softmax 1000x1

TOTAL 61 million

61 million parameters to

find by training on data

23

Need many images in training set

24

Tools for pre- and post-processing

Also

removing average

distortions eg

rotation amp flips

etcResizingCropping

25

Iterative design

26

Why MATLAB for deep learning

27

Why MATLAB for Deep Learning

Ability to work with signal images financial geospatial etc data

Library of algorithms for image signal and computer vision

Built-in GPU support for functions such as image rotation convolution

transformation and filtering

Visualization

Lots of community packages eg MatConvNet Caffe deep learning

toolbox in File Exchange

28

29

Start with a pre-trained network

30

Managing image datasets

imageSet (new in R2014b)

Automated file-based workflow

ndash Labelling

ndash Partition

ndash Reading

ndash Indexing

Integrated in Computer

Vision workflows

31

Image Acquisition Toolbox

Support for

ndash industry standards including

DCAM Camera Link

and GigE Vision

ndash Common OS interfaces for webcams including Direct Show QuickTime

and video4linux2

ndash A range of industrial and scientific hardware vendors

ndash Microsoft Kinect

Built-in MATLAB support for

ndash Webcams

ndash IP Cameras

Supported hardware

32

How MATLAB addresses challenges

Large sets of images that dont fit in memory imageSet

Image Processing and Computer Vision tools for pre- and post-processing

Long running training built-in GPU support for over 200 MATLAB functions

45 Image Processing function 90 Statistics and Machine Learning

functions etc

MATLAB offers flexible architecture for customized workflows

Community toolboxes for ConvNets

33

FIN

34

FIN

Page 9: Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl and his colleagues won $22,000 with a deep-learning system. 'We improved on Merck's

9

Convolution

A convolutional layer operates on a three-dimensional array ie an image

with red green and blue channels

10

Convolutions tend to act as edge filters

11

ReLU

12

Pooling

13

Average Pooling

0 1 1 1 0 0

0 2 1 0 1 0

2 1 0 0 0 0

2 2 1 4 3 4

2 3 2 3 4 4

1 2 3 4 3 3

09 01

2 36

3 times 3

Average

Pooling

14

Max Pooling

0 1 1 1 0 0

0 2 1 0 1 0

2 1 0 0 0 0

2 2 1 4 3 4

2 3 2 3 4 4

1 2 3 4 3 3

2 1

3 4

3 times 3

Max

Pooling

15

Fully connected layers

16

Layers

Convolution

ReLU

Pooling

Fully connected

Softmax

Local response normalization

hellip

17

A deep network might be hellip

18

Demo ndash Object Recognition

19

Training

Trained to perform classification on the ImageNet ILSVRC challenge data

ndash 12 million images of varying size cropped to 224x224

ndash Each image falls into one of 1000 categories

Training takes approximately a week

ndash This demo doesnrsquot show training

We will use a pre-trained network vgg-f

20

Object Recognition using Deep Learning

Training

(using GPU)Millions of images from 1000 different categories

Prediction Real-time object recognition using a webcam connected to a

laptop

21

Challenges with deep learning

22

Large number of

parameters to find

Layer Details Output Size Number of Parameters

Input 224x224x3

Conv 1 64 filters 11x11Stride 4 Pad 0

54x54x64 6411113 = 23232

LRN 54x54x64

Max Pool x2 downsample 27x27x64

Conv 2 256 5x5Stride 1 Pad 1

25x25x256 2565564 = 409600

LRN 25x25x256

Max Pool x2 downsample 12x12x256

Conv 3 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 4 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 5 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Max Pool x2 downsample 6x6x256

Full Connect 6 4096 4096x1 662564096 = 37748736 (38 million)

Dropout 4096x1

Full connect 7 4096 4096x1 40964096 = 16777216 (168 million)

Dropout 4096x1

Full connect 8 1000 1000x1 40961000 = 4096000 (4 million)

Softmax 1000x1

TOTAL 61 million

61 million parameters to

find by training on data

23

Need many images in training set

24

Tools for pre- and post-processing

Also

removing average

distortions eg

rotation amp flips

etcResizingCropping

25

Iterative design

26

Why MATLAB for deep learning

27

Why MATLAB for Deep Learning

Ability to work with signal images financial geospatial etc data

Library of algorithms for image signal and computer vision

Built-in GPU support for functions such as image rotation convolution

transformation and filtering

Visualization

Lots of community packages eg MatConvNet Caffe deep learning

toolbox in File Exchange

28

29

Start with a pre-trained network

30

Managing image datasets

imageSet (new in R2014b)

Automated file-based workflow

ndash Labelling

ndash Partition

ndash Reading

ndash Indexing

Integrated in Computer

Vision workflows

31

Image Acquisition Toolbox

Support for

ndash industry standards including

DCAM Camera Link

and GigE Vision

ndash Common OS interfaces for webcams including Direct Show QuickTime

and video4linux2

ndash A range of industrial and scientific hardware vendors

ndash Microsoft Kinect

Built-in MATLAB support for

ndash Webcams

ndash IP Cameras

Supported hardware

32

How MATLAB addresses challenges

Large sets of images that dont fit in memory imageSet

Image Processing and Computer Vision tools for pre- and post-processing

Long running training built-in GPU support for over 200 MATLAB functions

45 Image Processing function 90 Statistics and Machine Learning

functions etc

MATLAB offers flexible architecture for customized workflows

Community toolboxes for ConvNets

33

FIN

34

FIN

Page 10: Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl and his colleagues won $22,000 with a deep-learning system. 'We improved on Merck's

10

Convolutions tend to act as edge filters

11

ReLU

12

Pooling

13

Average Pooling

0 1 1 1 0 0

0 2 1 0 1 0

2 1 0 0 0 0

2 2 1 4 3 4

2 3 2 3 4 4

1 2 3 4 3 3

09 01

2 36

3 times 3

Average

Pooling

14

Max Pooling

0 1 1 1 0 0

0 2 1 0 1 0

2 1 0 0 0 0

2 2 1 4 3 4

2 3 2 3 4 4

1 2 3 4 3 3

2 1

3 4

3 times 3

Max

Pooling

15

Fully connected layers

16

Layers

Convolution

ReLU

Pooling

Fully connected

Softmax

Local response normalization

hellip

17

A deep network might be hellip

18

Demo ndash Object Recognition

19

Training

Trained to perform classification on the ImageNet ILSVRC challenge data

ndash 12 million images of varying size cropped to 224x224

ndash Each image falls into one of 1000 categories

Training takes approximately a week

ndash This demo doesnrsquot show training

We will use a pre-trained network vgg-f

20

Object Recognition using Deep Learning

Training

(using GPU)Millions of images from 1000 different categories

Prediction Real-time object recognition using a webcam connected to a

laptop

21

Challenges with deep learning

22

Large number of

parameters to find

Layer Details Output Size Number of Parameters

Input 224x224x3

Conv 1 64 filters 11x11Stride 4 Pad 0

54x54x64 6411113 = 23232

LRN 54x54x64

Max Pool x2 downsample 27x27x64

Conv 2 256 5x5Stride 1 Pad 1

25x25x256 2565564 = 409600

LRN 25x25x256

Max Pool x2 downsample 12x12x256

Conv 3 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 4 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 5 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Max Pool x2 downsample 6x6x256

Full Connect 6 4096 4096x1 662564096 = 37748736 (38 million)

Dropout 4096x1

Full connect 7 4096 4096x1 40964096 = 16777216 (168 million)

Dropout 4096x1

Full connect 8 1000 1000x1 40961000 = 4096000 (4 million)

Softmax 1000x1

TOTAL 61 million

61 million parameters to

find by training on data

23

Need many images in training set

24

Tools for pre- and post-processing

Also

removing average

distortions eg

rotation amp flips

etcResizingCropping

25

Iterative design

26

Why MATLAB for deep learning

27

Why MATLAB for Deep Learning

Ability to work with signal images financial geospatial etc data

Library of algorithms for image signal and computer vision

Built-in GPU support for functions such as image rotation convolution

transformation and filtering

Visualization

Lots of community packages eg MatConvNet Caffe deep learning

toolbox in File Exchange

28

29

Start with a pre-trained network

30

Managing image datasets

imageSet (new in R2014b)

Automated file-based workflow

ndash Labelling

ndash Partition

ndash Reading

ndash Indexing

Integrated in Computer

Vision workflows

31

Image Acquisition Toolbox

Support for

ndash industry standards including

DCAM Camera Link

and GigE Vision

ndash Common OS interfaces for webcams including Direct Show QuickTime

and video4linux2

ndash A range of industrial and scientific hardware vendors

ndash Microsoft Kinect

Built-in MATLAB support for

ndash Webcams

ndash IP Cameras

Supported hardware

32

How MATLAB addresses challenges

Large sets of images that dont fit in memory imageSet

Image Processing and Computer Vision tools for pre- and post-processing

Long running training built-in GPU support for over 200 MATLAB functions

45 Image Processing function 90 Statistics and Machine Learning

functions etc

MATLAB offers flexible architecture for customized workflows

Community toolboxes for ConvNets

33

FIN

34

FIN

Page 11: Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl and his colleagues won $22,000 with a deep-learning system. 'We improved on Merck's

11

ReLU

12

Pooling

13

Average Pooling

0 1 1 1 0 0

0 2 1 0 1 0

2 1 0 0 0 0

2 2 1 4 3 4

2 3 2 3 4 4

1 2 3 4 3 3

09 01

2 36

3 times 3

Average

Pooling

14

Max Pooling

0 1 1 1 0 0

0 2 1 0 1 0

2 1 0 0 0 0

2 2 1 4 3 4

2 3 2 3 4 4

1 2 3 4 3 3

2 1

3 4

3 times 3

Max

Pooling

15

Fully connected layers

16

Layers

Convolution

ReLU

Pooling

Fully connected

Softmax

Local response normalization

hellip

17

A deep network might be hellip

18

Demo ndash Object Recognition

19

Training

Trained to perform classification on the ImageNet ILSVRC challenge data

ndash 12 million images of varying size cropped to 224x224

ndash Each image falls into one of 1000 categories

Training takes approximately a week

ndash This demo doesnrsquot show training

We will use a pre-trained network vgg-f

20

Object Recognition using Deep Learning

Training

(using GPU)Millions of images from 1000 different categories

Prediction Real-time object recognition using a webcam connected to a

laptop

21

Challenges with deep learning

22

Large number of

parameters to find

Layer Details Output Size Number of Parameters

Input 224x224x3

Conv 1 64 filters 11x11Stride 4 Pad 0

54x54x64 6411113 = 23232

LRN 54x54x64

Max Pool x2 downsample 27x27x64

Conv 2 256 5x5Stride 1 Pad 1

25x25x256 2565564 = 409600

LRN 25x25x256

Max Pool x2 downsample 12x12x256

Conv 3 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 4 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 5 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Max Pool x2 downsample 6x6x256

Full Connect 6 4096 4096x1 662564096 = 37748736 (38 million)

Dropout 4096x1

Full connect 7 4096 4096x1 40964096 = 16777216 (168 million)

Dropout 4096x1

Full connect 8 1000 1000x1 40961000 = 4096000 (4 million)

Softmax 1000x1

TOTAL 61 million

61 million parameters to

find by training on data

23

Need many images in training set

24

Tools for pre- and post-processing

Also

removing average

distortions eg

rotation amp flips

etcResizingCropping

25

Iterative design

26

Why MATLAB for deep learning

27

Why MATLAB for Deep Learning

Ability to work with signal images financial geospatial etc data

Library of algorithms for image signal and computer vision

Built-in GPU support for functions such as image rotation convolution

transformation and filtering

Visualization

Lots of community packages eg MatConvNet Caffe deep learning

toolbox in File Exchange

28

29

Start with a pre-trained network

30

Managing image datasets

imageSet (new in R2014b)

Automated file-based workflow

ndash Labelling

ndash Partition

ndash Reading

ndash Indexing

Integrated in Computer

Vision workflows

31

Image Acquisition Toolbox

Support for

ndash industry standards including

DCAM Camera Link

and GigE Vision

ndash Common OS interfaces for webcams including Direct Show QuickTime

and video4linux2

ndash A range of industrial and scientific hardware vendors

ndash Microsoft Kinect

Built-in MATLAB support for

ndash Webcams

ndash IP Cameras

Supported hardware

32

How MATLAB addresses challenges

Large sets of images that dont fit in memory imageSet

Image Processing and Computer Vision tools for pre- and post-processing

Long running training built-in GPU support for over 200 MATLAB functions

45 Image Processing function 90 Statistics and Machine Learning

functions etc

MATLAB offers flexible architecture for customized workflows

Community toolboxes for ConvNets

33

FIN

34

FIN

Page 12: Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl and his colleagues won $22,000 with a deep-learning system. 'We improved on Merck's

12

Pooling

13

Average Pooling

0 1 1 1 0 0

0 2 1 0 1 0

2 1 0 0 0 0

2 2 1 4 3 4

2 3 2 3 4 4

1 2 3 4 3 3

09 01

2 36

3 times 3

Average

Pooling

14

Max Pooling

0 1 1 1 0 0

0 2 1 0 1 0

2 1 0 0 0 0

2 2 1 4 3 4

2 3 2 3 4 4

1 2 3 4 3 3

2 1

3 4

3 times 3

Max

Pooling

15

Fully connected layers

16

Layers

Convolution

ReLU

Pooling

Fully connected

Softmax

Local response normalization

hellip

17

A deep network might be hellip

18

Demo ndash Object Recognition

19

Training

Trained to perform classification on the ImageNet ILSVRC challenge data

ndash 12 million images of varying size cropped to 224x224

ndash Each image falls into one of 1000 categories

Training takes approximately a week

ndash This demo doesnrsquot show training

We will use a pre-trained network vgg-f

20

Object Recognition using Deep Learning

Training

(using GPU)Millions of images from 1000 different categories

Prediction Real-time object recognition using a webcam connected to a

laptop

21

Challenges with deep learning

22

Large number of

parameters to find

Layer Details Output Size Number of Parameters

Input 224x224x3

Conv 1 64 filters 11x11Stride 4 Pad 0

54x54x64 6411113 = 23232

LRN 54x54x64

Max Pool x2 downsample 27x27x64

Conv 2 256 5x5Stride 1 Pad 1

25x25x256 2565564 = 409600

LRN 25x25x256

Max Pool x2 downsample 12x12x256

Conv 3 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 4 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 5 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Max Pool x2 downsample 6x6x256

Full Connect 6 4096 4096x1 662564096 = 37748736 (38 million)

Dropout 4096x1

Full connect 7 4096 4096x1 40964096 = 16777216 (168 million)

Dropout 4096x1

Full connect 8 1000 1000x1 40961000 = 4096000 (4 million)

Softmax 1000x1

TOTAL 61 million

61 million parameters to

find by training on data

23

Need many images in training set

24

Tools for pre- and post-processing

Also

removing average

distortions eg

rotation amp flips

etcResizingCropping

25

Iterative design

26

Why MATLAB for deep learning

27

Why MATLAB for Deep Learning

Ability to work with signal images financial geospatial etc data

Library of algorithms for image signal and computer vision

Built-in GPU support for functions such as image rotation convolution

transformation and filtering

Visualization

Lots of community packages eg MatConvNet Caffe deep learning

toolbox in File Exchange

28

29

Start with a pre-trained network

30

Managing image datasets

imageSet (new in R2014b)

Automated file-based workflow

ndash Labelling

ndash Partition

ndash Reading

ndash Indexing

Integrated in Computer

Vision workflows

31

Image Acquisition Toolbox

Support for

ndash industry standards including

DCAM Camera Link

and GigE Vision

ndash Common OS interfaces for webcams including Direct Show QuickTime

and video4linux2

ndash A range of industrial and scientific hardware vendors

ndash Microsoft Kinect

Built-in MATLAB support for

ndash Webcams

ndash IP Cameras

Supported hardware

32

How MATLAB addresses challenges

Large sets of images that dont fit in memory imageSet

Image Processing and Computer Vision tools for pre- and post-processing

Long running training built-in GPU support for over 200 MATLAB functions

45 Image Processing function 90 Statistics and Machine Learning

functions etc

MATLAB offers flexible architecture for customized workflows

Community toolboxes for ConvNets

33

FIN

34

FIN

Page 13: Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl and his colleagues won $22,000 with a deep-learning system. 'We improved on Merck's

13

Average Pooling

0 1 1 1 0 0

0 2 1 0 1 0

2 1 0 0 0 0

2 2 1 4 3 4

2 3 2 3 4 4

1 2 3 4 3 3

09 01

2 36

3 times 3

Average

Pooling

14

Max Pooling

0 1 1 1 0 0

0 2 1 0 1 0

2 1 0 0 0 0

2 2 1 4 3 4

2 3 2 3 4 4

1 2 3 4 3 3

2 1

3 4

3 times 3

Max

Pooling

15

Fully connected layers

16

Layers

Convolution

ReLU

Pooling

Fully connected

Softmax

Local response normalization

hellip

17

A deep network might be hellip

18

Demo ndash Object Recognition

19

Training

Trained to perform classification on the ImageNet ILSVRC challenge data

ndash 12 million images of varying size cropped to 224x224

ndash Each image falls into one of 1000 categories

Training takes approximately a week

ndash This demo doesnrsquot show training

We will use a pre-trained network vgg-f

20

Object Recognition using Deep Learning

Training

(using GPU)Millions of images from 1000 different categories

Prediction Real-time object recognition using a webcam connected to a

laptop

21

Challenges with deep learning

22

Large number of

parameters to find

Layer Details Output Size Number of Parameters

Input 224x224x3

Conv 1 64 filters 11x11Stride 4 Pad 0

54x54x64 6411113 = 23232

LRN 54x54x64

Max Pool x2 downsample 27x27x64

Conv 2 256 5x5Stride 1 Pad 1

25x25x256 2565564 = 409600

LRN 25x25x256

Max Pool x2 downsample 12x12x256

Conv 3 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 4 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 5 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Max Pool x2 downsample 6x6x256

Full Connect 6 4096 4096x1 662564096 = 37748736 (38 million)

Dropout 4096x1

Full connect 7 4096 4096x1 40964096 = 16777216 (168 million)

Dropout 4096x1

Full connect 8 1000 1000x1 40961000 = 4096000 (4 million)

Softmax 1000x1

TOTAL 61 million

61 million parameters to

find by training on data

23

Need many images in training set

24

Tools for pre- and post-processing

Also

removing average

distortions eg

rotation amp flips

etcResizingCropping

25

Iterative design

26

Why MATLAB for deep learning

27

Why MATLAB for Deep Learning

Ability to work with signal images financial geospatial etc data

Library of algorithms for image signal and computer vision

Built-in GPU support for functions such as image rotation convolution

transformation and filtering

Visualization

Lots of community packages eg MatConvNet Caffe deep learning

toolbox in File Exchange

28

29

Start with a pre-trained network

30

Managing image datasets

imageSet (new in R2014b)

Automated file-based workflow

ndash Labelling

ndash Partition

ndash Reading

ndash Indexing

Integrated in Computer

Vision workflows

31

Image Acquisition Toolbox

Support for

ndash industry standards including

DCAM Camera Link

and GigE Vision

ndash Common OS interfaces for webcams including Direct Show QuickTime

and video4linux2

ndash A range of industrial and scientific hardware vendors

ndash Microsoft Kinect

Built-in MATLAB support for

ndash Webcams

ndash IP Cameras

Supported hardware

32

How MATLAB addresses challenges

Large sets of images that dont fit in memory imageSet

Image Processing and Computer Vision tools for pre- and post-processing

Long running training built-in GPU support for over 200 MATLAB functions

45 Image Processing function 90 Statistics and Machine Learning

functions etc

MATLAB offers flexible architecture for customized workflows

Community toolboxes for ConvNets

33

FIN

34

FIN

Page 14: Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl and his colleagues won $22,000 with a deep-learning system. 'We improved on Merck's

14

Max Pooling

0 1 1 1 0 0

0 2 1 0 1 0

2 1 0 0 0 0

2 2 1 4 3 4

2 3 2 3 4 4

1 2 3 4 3 3

2 1

3 4

3 times 3

Max

Pooling

15

Fully connected layers

16

Layers

Convolution

ReLU

Pooling

Fully connected

Softmax

Local response normalization

hellip

17

A deep network might be hellip

18

Demo ndash Object Recognition

19

Training

Trained to perform classification on the ImageNet ILSVRC challenge data

ndash 12 million images of varying size cropped to 224x224

ndash Each image falls into one of 1000 categories

Training takes approximately a week

ndash This demo doesnrsquot show training

We will use a pre-trained network vgg-f

20

Object Recognition using Deep Learning

Training

(using GPU)Millions of images from 1000 different categories

Prediction Real-time object recognition using a webcam connected to a

laptop

21

Challenges with deep learning

22

Large number of

parameters to find

Layer Details Output Size Number of Parameters

Input 224x224x3

Conv 1 64 filters 11x11Stride 4 Pad 0

54x54x64 6411113 = 23232

LRN 54x54x64

Max Pool x2 downsample 27x27x64

Conv 2 256 5x5Stride 1 Pad 1

25x25x256 2565564 = 409600

LRN 25x25x256

Max Pool x2 downsample 12x12x256

Conv 3 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 4 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 5 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Max Pool x2 downsample 6x6x256

Full Connect 6 4096 4096x1 662564096 = 37748736 (38 million)

Dropout 4096x1

Full connect 7 4096 4096x1 40964096 = 16777216 (168 million)

Dropout 4096x1

Full connect 8 1000 1000x1 40961000 = 4096000 (4 million)

Softmax 1000x1

TOTAL 61 million

61 million parameters to

find by training on data

23

Need many images in training set

24

Tools for pre- and post-processing

Also

removing average

distortions eg

rotation amp flips

etcResizingCropping

25

Iterative design

26

Why MATLAB for deep learning

27

Why MATLAB for Deep Learning

Ability to work with signal images financial geospatial etc data

Library of algorithms for image signal and computer vision

Built-in GPU support for functions such as image rotation convolution

transformation and filtering

Visualization

Lots of community packages eg MatConvNet Caffe deep learning

toolbox in File Exchange

28

29

Start with a pre-trained network

30

Managing image datasets

imageSet (new in R2014b)

Automated file-based workflow

ndash Labelling

ndash Partition

ndash Reading

ndash Indexing

Integrated in Computer

Vision workflows

31

Image Acquisition Toolbox

Support for

ndash industry standards including

DCAM Camera Link

and GigE Vision

ndash Common OS interfaces for webcams including Direct Show QuickTime

and video4linux2

ndash A range of industrial and scientific hardware vendors

ndash Microsoft Kinect

Built-in MATLAB support for

ndash Webcams

ndash IP Cameras

Supported hardware

32

How MATLAB addresses challenges

Large sets of images that dont fit in memory imageSet

Image Processing and Computer Vision tools for pre- and post-processing

Long running training built-in GPU support for over 200 MATLAB functions

45 Image Processing function 90 Statistics and Machine Learning

functions etc

MATLAB offers flexible architecture for customized workflows

Community toolboxes for ConvNets

33

FIN

34

FIN

Page 15: Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl and his colleagues won $22,000 with a deep-learning system. 'We improved on Merck's

15

Fully connected layers

16

Layers

Convolution

ReLU

Pooling

Fully connected

Softmax

Local response normalization

hellip

17

A deep network might be hellip

18

Demo ndash Object Recognition

19

Training

Trained to perform classification on the ImageNet ILSVRC challenge data

ndash 12 million images of varying size cropped to 224x224

ndash Each image falls into one of 1000 categories

Training takes approximately a week

ndash This demo doesnrsquot show training

We will use a pre-trained network vgg-f

20

Object Recognition using Deep Learning

Training

(using GPU)Millions of images from 1000 different categories

Prediction Real-time object recognition using a webcam connected to a

laptop

21

Challenges with deep learning

22

Large number of

parameters to find

Layer Details Output Size Number of Parameters

Input 224x224x3

Conv 1 64 filters 11x11Stride 4 Pad 0

54x54x64 6411113 = 23232

LRN 54x54x64

Max Pool x2 downsample 27x27x64

Conv 2 256 5x5Stride 1 Pad 1

25x25x256 2565564 = 409600

LRN 25x25x256

Max Pool x2 downsample 12x12x256

Conv 3 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 4 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 5 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Max Pool x2 downsample 6x6x256

Full Connect 6 4096 4096x1 662564096 = 37748736 (38 million)

Dropout 4096x1

Full connect 7 4096 4096x1 40964096 = 16777216 (168 million)

Dropout 4096x1

Full connect 8 1000 1000x1 40961000 = 4096000 (4 million)

Softmax 1000x1

TOTAL 61 million

61 million parameters to

find by training on data

23

Need many images in training set

24

Tools for pre- and post-processing

Also

removing average

distortions eg

rotation amp flips

etcResizingCropping

25

Iterative design

26

Why MATLAB for deep learning

27

Why MATLAB for Deep Learning

Ability to work with signal images financial geospatial etc data

Library of algorithms for image signal and computer vision

Built-in GPU support for functions such as image rotation convolution

transformation and filtering

Visualization

Lots of community packages eg MatConvNet Caffe deep learning

toolbox in File Exchange

28

29

Start with a pre-trained network

30

Managing image datasets

imageSet (new in R2014b)

Automated file-based workflow

ndash Labelling

ndash Partition

ndash Reading

ndash Indexing

Integrated in Computer

Vision workflows

31

Image Acquisition Toolbox

Support for

ndash industry standards including

DCAM Camera Link

and GigE Vision

ndash Common OS interfaces for webcams including Direct Show QuickTime

and video4linux2

ndash A range of industrial and scientific hardware vendors

ndash Microsoft Kinect

Built-in MATLAB support for

ndash Webcams

ndash IP Cameras

Supported hardware

32

How MATLAB addresses challenges

Large sets of images that dont fit in memory imageSet

Image Processing and Computer Vision tools for pre- and post-processing

Long running training built-in GPU support for over 200 MATLAB functions

45 Image Processing function 90 Statistics and Machine Learning

functions etc

MATLAB offers flexible architecture for customized workflows

Community toolboxes for ConvNets

33

FIN

34

FIN

Page 16: Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl and his colleagues won $22,000 with a deep-learning system. 'We improved on Merck's

16

Layers

Convolution

ReLU

Pooling

Fully connected

Softmax

Local response normalization

hellip

17

A deep network might be hellip

18

Demo ndash Object Recognition

19

Training

Trained to perform classification on the ImageNet ILSVRC challenge data

ndash 12 million images of varying size cropped to 224x224

ndash Each image falls into one of 1000 categories

Training takes approximately a week

ndash This demo doesnrsquot show training

We will use a pre-trained network vgg-f

20

Object Recognition using Deep Learning

Training

(using GPU)Millions of images from 1000 different categories

Prediction Real-time object recognition using a webcam connected to a

laptop

21

Challenges with deep learning

22

Large number of

parameters to find

Layer Details Output Size Number of Parameters

Input 224x224x3

Conv 1 64 filters 11x11Stride 4 Pad 0

54x54x64 6411113 = 23232

LRN 54x54x64

Max Pool x2 downsample 27x27x64

Conv 2 256 5x5Stride 1 Pad 1

25x25x256 2565564 = 409600

LRN 25x25x256

Max Pool x2 downsample 12x12x256

Conv 3 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 4 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 5 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Max Pool x2 downsample 6x6x256

Full Connect 6 4096 4096x1 662564096 = 37748736 (38 million)

Dropout 4096x1

Full connect 7 4096 4096x1 40964096 = 16777216 (168 million)

Dropout 4096x1

Full connect 8 1000 1000x1 40961000 = 4096000 (4 million)

Softmax 1000x1

TOTAL 61 million

61 million parameters to

find by training on data

23

Need many images in training set

24

Tools for pre- and post-processing

Also

removing average

distortions eg

rotation amp flips

etcResizingCropping

25

Iterative design

26

Why MATLAB for deep learning

27

Why MATLAB for Deep Learning

Ability to work with signal images financial geospatial etc data

Library of algorithms for image signal and computer vision

Built-in GPU support for functions such as image rotation convolution

transformation and filtering

Visualization

Lots of community packages eg MatConvNet Caffe deep learning

toolbox in File Exchange

28

29

Start with a pre-trained network

30

Managing image datasets

imageSet (new in R2014b)

Automated file-based workflow

ndash Labelling

ndash Partition

ndash Reading

ndash Indexing

Integrated in Computer

Vision workflows

31

Image Acquisition Toolbox

Support for

ndash industry standards including

DCAM Camera Link

and GigE Vision

ndash Common OS interfaces for webcams including Direct Show QuickTime

and video4linux2

ndash A range of industrial and scientific hardware vendors

ndash Microsoft Kinect

Built-in MATLAB support for

ndash Webcams

ndash IP Cameras

Supported hardware

32

How MATLAB addresses challenges

Large sets of images that dont fit in memory imageSet

Image Processing and Computer Vision tools for pre- and post-processing

Long running training built-in GPU support for over 200 MATLAB functions

45 Image Processing function 90 Statistics and Machine Learning

functions etc

MATLAB offers flexible architecture for customized workflows

Community toolboxes for ConvNets

33

FIN

34

FIN

Page 17: Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl and his colleagues won $22,000 with a deep-learning system. 'We improved on Merck's

17

A deep network might be hellip

18

Demo ndash Object Recognition

19

Training

Trained to perform classification on the ImageNet ILSVRC challenge data

ndash 12 million images of varying size cropped to 224x224

ndash Each image falls into one of 1000 categories

Training takes approximately a week

ndash This demo doesnrsquot show training

We will use a pre-trained network vgg-f

20

Object Recognition using Deep Learning

Training

(using GPU)Millions of images from 1000 different categories

Prediction Real-time object recognition using a webcam connected to a

laptop

21

Challenges with deep learning

22

Large number of

parameters to find

Layer Details Output Size Number of Parameters

Input 224x224x3

Conv 1 64 filters 11x11Stride 4 Pad 0

54x54x64 6411113 = 23232

LRN 54x54x64

Max Pool x2 downsample 27x27x64

Conv 2 256 5x5Stride 1 Pad 1

25x25x256 2565564 = 409600

LRN 25x25x256

Max Pool x2 downsample 12x12x256

Conv 3 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 4 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 5 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Max Pool x2 downsample 6x6x256

Full Connect 6 4096 4096x1 662564096 = 37748736 (38 million)

Dropout 4096x1

Full connect 7 4096 4096x1 40964096 = 16777216 (168 million)

Dropout 4096x1

Full connect 8 1000 1000x1 40961000 = 4096000 (4 million)

Softmax 1000x1

TOTAL 61 million

61 million parameters to

find by training on data

23

Need many images in training set

24

Tools for pre- and post-processing

Also

removing average

distortions eg

rotation amp flips

etcResizingCropping

25

Iterative design

26

Why MATLAB for deep learning

27

Why MATLAB for Deep Learning

Ability to work with signal images financial geospatial etc data

Library of algorithms for image signal and computer vision

Built-in GPU support for functions such as image rotation convolution

transformation and filtering

Visualization

Lots of community packages eg MatConvNet Caffe deep learning

toolbox in File Exchange

28

29

Start with a pre-trained network

30

Managing image datasets

imageSet (new in R2014b)

Automated file-based workflow

ndash Labelling

ndash Partition

ndash Reading

ndash Indexing

Integrated in Computer

Vision workflows

31

Image Acquisition Toolbox

Support for

ndash industry standards including

DCAM Camera Link

and GigE Vision

ndash Common OS interfaces for webcams including Direct Show QuickTime

and video4linux2

ndash A range of industrial and scientific hardware vendors

ndash Microsoft Kinect

Built-in MATLAB support for

ndash Webcams

ndash IP Cameras

Supported hardware

32

How MATLAB addresses challenges

Large sets of images that dont fit in memory imageSet

Image Processing and Computer Vision tools for pre- and post-processing

Long running training built-in GPU support for over 200 MATLAB functions

45 Image Processing function 90 Statistics and Machine Learning

functions etc

MATLAB offers flexible architecture for customized workflows

Community toolboxes for ConvNets

33

FIN

34

FIN

Page 18: Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl and his colleagues won $22,000 with a deep-learning system. 'We improved on Merck's

18

Demo ndash Object Recognition

19

Training

Trained to perform classification on the ImageNet ILSVRC challenge data

ndash 12 million images of varying size cropped to 224x224

ndash Each image falls into one of 1000 categories

Training takes approximately a week

ndash This demo doesnrsquot show training

We will use a pre-trained network vgg-f

20

Object Recognition using Deep Learning

Training

(using GPU)Millions of images from 1000 different categories

Prediction Real-time object recognition using a webcam connected to a

laptop

21

Challenges with deep learning

22

Large number of

parameters to find

Layer Details Output Size Number of Parameters

Input 224x224x3

Conv 1 64 filters 11x11Stride 4 Pad 0

54x54x64 6411113 = 23232

LRN 54x54x64

Max Pool x2 downsample 27x27x64

Conv 2 256 5x5Stride 1 Pad 1

25x25x256 2565564 = 409600

LRN 25x25x256

Max Pool x2 downsample 12x12x256

Conv 3 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 4 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 5 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Max Pool x2 downsample 6x6x256

Full Connect 6 4096 4096x1 662564096 = 37748736 (38 million)

Dropout 4096x1

Full connect 7 4096 4096x1 40964096 = 16777216 (168 million)

Dropout 4096x1

Full connect 8 1000 1000x1 40961000 = 4096000 (4 million)

Softmax 1000x1

TOTAL 61 million

61 million parameters to

find by training on data

23

Need many images in training set

24

Tools for pre- and post-processing

Also

removing average

distortions eg

rotation amp flips

etcResizingCropping

25

Iterative design

26

Why MATLAB for deep learning

27

Why MATLAB for Deep Learning

Ability to work with signal images financial geospatial etc data

Library of algorithms for image signal and computer vision

Built-in GPU support for functions such as image rotation convolution

transformation and filtering

Visualization

Lots of community packages eg MatConvNet Caffe deep learning

toolbox in File Exchange

28

29

Start with a pre-trained network

30

Managing image datasets

imageSet (new in R2014b)

Automated file-based workflow

ndash Labelling

ndash Partition

ndash Reading

ndash Indexing

Integrated in Computer

Vision workflows

31

Image Acquisition Toolbox

Support for

ndash industry standards including

DCAM Camera Link

and GigE Vision

ndash Common OS interfaces for webcams including Direct Show QuickTime

and video4linux2

ndash A range of industrial and scientific hardware vendors

ndash Microsoft Kinect

Built-in MATLAB support for

ndash Webcams

ndash IP Cameras

Supported hardware

32

How MATLAB addresses challenges

Large sets of images that dont fit in memory imageSet

Image Processing and Computer Vision tools for pre- and post-processing

Long running training built-in GPU support for over 200 MATLAB functions

45 Image Processing function 90 Statistics and Machine Learning

functions etc

MATLAB offers flexible architecture for customized workflows

Community toolboxes for ConvNets

33

FIN

34

FIN

Page 19: Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl and his colleagues won $22,000 with a deep-learning system. 'We improved on Merck's

19

Training

Trained to perform classification on the ImageNet ILSVRC challenge data

ndash 12 million images of varying size cropped to 224x224

ndash Each image falls into one of 1000 categories

Training takes approximately a week

ndash This demo doesnrsquot show training

We will use a pre-trained network vgg-f

20

Object Recognition using Deep Learning

Training

(using GPU)Millions of images from 1000 different categories

Prediction Real-time object recognition using a webcam connected to a

laptop

21

Challenges with deep learning

22

Large number of

parameters to find

Layer Details Output Size Number of Parameters

Input 224x224x3

Conv 1 64 filters 11x11Stride 4 Pad 0

54x54x64 6411113 = 23232

LRN 54x54x64

Max Pool x2 downsample 27x27x64

Conv 2 256 5x5Stride 1 Pad 1

25x25x256 2565564 = 409600

LRN 25x25x256

Max Pool x2 downsample 12x12x256

Conv 3 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 4 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 5 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Max Pool x2 downsample 6x6x256

Full Connect 6 4096 4096x1 662564096 = 37748736 (38 million)

Dropout 4096x1

Full connect 7 4096 4096x1 40964096 = 16777216 (168 million)

Dropout 4096x1

Full connect 8 1000 1000x1 40961000 = 4096000 (4 million)

Softmax 1000x1

TOTAL 61 million

61 million parameters to

find by training on data

23

Need many images in training set

24

Tools for pre- and post-processing

Also

removing average

distortions eg

rotation amp flips

etcResizingCropping

25

Iterative design

26

Why MATLAB for deep learning

27

Why MATLAB for Deep Learning

Ability to work with signal images financial geospatial etc data

Library of algorithms for image signal and computer vision

Built-in GPU support for functions such as image rotation convolution

transformation and filtering

Visualization

Lots of community packages eg MatConvNet Caffe deep learning

toolbox in File Exchange

28

29

Start with a pre-trained network

30

Managing image datasets

imageSet (new in R2014b)

Automated file-based workflow

ndash Labelling

ndash Partition

ndash Reading

ndash Indexing

Integrated in Computer

Vision workflows

31

Image Acquisition Toolbox

Support for

ndash industry standards including

DCAM Camera Link

and GigE Vision

ndash Common OS interfaces for webcams including Direct Show QuickTime

and video4linux2

ndash A range of industrial and scientific hardware vendors

ndash Microsoft Kinect

Built-in MATLAB support for

ndash Webcams

ndash IP Cameras

Supported hardware

32

How MATLAB addresses challenges

Large sets of images that dont fit in memory imageSet

Image Processing and Computer Vision tools for pre- and post-processing

Long running training built-in GPU support for over 200 MATLAB functions

45 Image Processing function 90 Statistics and Machine Learning

functions etc

MATLAB offers flexible architecture for customized workflows

Community toolboxes for ConvNets

33

FIN

34

FIN

Page 20: Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl and his colleagues won $22,000 with a deep-learning system. 'We improved on Merck's

20

Object Recognition using Deep Learning

Training

(using GPU)Millions of images from 1000 different categories

Prediction Real-time object recognition using a webcam connected to a

laptop

21

Challenges with deep learning

22

Large number of

parameters to find

Layer Details Output Size Number of Parameters

Input 224x224x3

Conv 1 64 filters 11x11Stride 4 Pad 0

54x54x64 6411113 = 23232

LRN 54x54x64

Max Pool x2 downsample 27x27x64

Conv 2 256 5x5Stride 1 Pad 1

25x25x256 2565564 = 409600

LRN 25x25x256

Max Pool x2 downsample 12x12x256

Conv 3 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 4 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 5 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Max Pool x2 downsample 6x6x256

Full Connect 6 4096 4096x1 662564096 = 37748736 (38 million)

Dropout 4096x1

Full connect 7 4096 4096x1 40964096 = 16777216 (168 million)

Dropout 4096x1

Full connect 8 1000 1000x1 40961000 = 4096000 (4 million)

Softmax 1000x1

TOTAL 61 million

61 million parameters to

find by training on data

23

Need many images in training set

24

Tools for pre- and post-processing

Also

removing average

distortions eg

rotation amp flips

etcResizingCropping

25

Iterative design

26

Why MATLAB for deep learning

27

Why MATLAB for Deep Learning

Ability to work with signal images financial geospatial etc data

Library of algorithms for image signal and computer vision

Built-in GPU support for functions such as image rotation convolution

transformation and filtering

Visualization

Lots of community packages eg MatConvNet Caffe deep learning

toolbox in File Exchange

28

29

Start with a pre-trained network

30

Managing image datasets

imageSet (new in R2014b)

Automated file-based workflow

ndash Labelling

ndash Partition

ndash Reading

ndash Indexing

Integrated in Computer

Vision workflows

31

Image Acquisition Toolbox

Support for

ndash industry standards including

DCAM Camera Link

and GigE Vision

ndash Common OS interfaces for webcams including Direct Show QuickTime

and video4linux2

ndash A range of industrial and scientific hardware vendors

ndash Microsoft Kinect

Built-in MATLAB support for

ndash Webcams

ndash IP Cameras

Supported hardware

32

How MATLAB addresses challenges

Large sets of images that dont fit in memory imageSet

Image Processing and Computer Vision tools for pre- and post-processing

Long running training built-in GPU support for over 200 MATLAB functions

45 Image Processing function 90 Statistics and Machine Learning

functions etc

MATLAB offers flexible architecture for customized workflows

Community toolboxes for ConvNets

33

FIN

34

FIN

Page 21: Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl and his colleagues won $22,000 with a deep-learning system. 'We improved on Merck's

21

Challenges with deep learning

22

Large number of

parameters to find

Layer Details Output Size Number of Parameters

Input 224x224x3

Conv 1 64 filters 11x11Stride 4 Pad 0

54x54x64 6411113 = 23232

LRN 54x54x64

Max Pool x2 downsample 27x27x64

Conv 2 256 5x5Stride 1 Pad 1

25x25x256 2565564 = 409600

LRN 25x25x256

Max Pool x2 downsample 12x12x256

Conv 3 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 4 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 5 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Max Pool x2 downsample 6x6x256

Full Connect 6 4096 4096x1 662564096 = 37748736 (38 million)

Dropout 4096x1

Full connect 7 4096 4096x1 40964096 = 16777216 (168 million)

Dropout 4096x1

Full connect 8 1000 1000x1 40961000 = 4096000 (4 million)

Softmax 1000x1

TOTAL 61 million

61 million parameters to

find by training on data

23

Need many images in training set

24

Tools for pre- and post-processing

Also

removing average

distortions eg

rotation amp flips

etcResizingCropping

25

Iterative design

26

Why MATLAB for deep learning

27

Why MATLAB for Deep Learning

Ability to work with signal images financial geospatial etc data

Library of algorithms for image signal and computer vision

Built-in GPU support for functions such as image rotation convolution

transformation and filtering

Visualization

Lots of community packages eg MatConvNet Caffe deep learning

toolbox in File Exchange

28

29

Start with a pre-trained network

30

Managing image datasets

imageSet (new in R2014b)

Automated file-based workflow

ndash Labelling

ndash Partition

ndash Reading

ndash Indexing

Integrated in Computer

Vision workflows

31

Image Acquisition Toolbox

Support for

ndash industry standards including

DCAM Camera Link

and GigE Vision

ndash Common OS interfaces for webcams including Direct Show QuickTime

and video4linux2

ndash A range of industrial and scientific hardware vendors

ndash Microsoft Kinect

Built-in MATLAB support for

ndash Webcams

ndash IP Cameras

Supported hardware

32

How MATLAB addresses challenges

Large sets of images that dont fit in memory imageSet

Image Processing and Computer Vision tools for pre- and post-processing

Long running training built-in GPU support for over 200 MATLAB functions

45 Image Processing function 90 Statistics and Machine Learning

functions etc

MATLAB offers flexible architecture for customized workflows

Community toolboxes for ConvNets

33

FIN

34

FIN

Page 22: Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl and his colleagues won $22,000 with a deep-learning system. 'We improved on Merck's

22

Large number of

parameters to find

Layer Details Output Size Number of Parameters

Input 224x224x3

Conv 1 64 filters 11x11Stride 4 Pad 0

54x54x64 6411113 = 23232

LRN 54x54x64

Max Pool x2 downsample 27x27x64

Conv 2 256 5x5Stride 1 Pad 1

25x25x256 2565564 = 409600

LRN 25x25x256

Max Pool x2 downsample 12x12x256

Conv 3 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 4 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Conv 5 256 3x3Stride 1 Pad 1

12x12x256 25633256 = 589824

Max Pool x2 downsample 6x6x256

Full Connect 6 4096 4096x1 662564096 = 37748736 (38 million)

Dropout 4096x1

Full connect 7 4096 4096x1 40964096 = 16777216 (168 million)

Dropout 4096x1

Full connect 8 1000 1000x1 40961000 = 4096000 (4 million)

Softmax 1000x1

TOTAL 61 million

61 million parameters to

find by training on data

23

Need many images in training set

24

Tools for pre- and post-processing

Also

removing average

distortions eg

rotation amp flips

etcResizingCropping

25

Iterative design

26

Why MATLAB for deep learning

27

Why MATLAB for Deep Learning

Ability to work with signal images financial geospatial etc data

Library of algorithms for image signal and computer vision

Built-in GPU support for functions such as image rotation convolution

transformation and filtering

Visualization

Lots of community packages eg MatConvNet Caffe deep learning

toolbox in File Exchange

28

29

Start with a pre-trained network

30

Managing image datasets

imageSet (new in R2014b)

Automated file-based workflow

ndash Labelling

ndash Partition

ndash Reading

ndash Indexing

Integrated in Computer

Vision workflows

31

Image Acquisition Toolbox

Support for

ndash industry standards including

DCAM Camera Link

and GigE Vision

ndash Common OS interfaces for webcams including Direct Show QuickTime

and video4linux2

ndash A range of industrial and scientific hardware vendors

ndash Microsoft Kinect

Built-in MATLAB support for

ndash Webcams

ndash IP Cameras

Supported hardware

32

How MATLAB addresses challenges

Large sets of images that dont fit in memory imageSet

Image Processing and Computer Vision tools for pre- and post-processing

Long running training built-in GPU support for over 200 MATLAB functions

45 Image Processing function 90 Statistics and Machine Learning

functions etc

MATLAB offers flexible architecture for customized workflows

Community toolboxes for ConvNets

33

FIN

34

FIN

Page 23: Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl and his colleagues won $22,000 with a deep-learning system. 'We improved on Merck's

23

Need many images in training set

24

Tools for pre- and post-processing

Also

removing average

distortions eg

rotation amp flips

etcResizingCropping

25

Iterative design

26

Why MATLAB for deep learning

27

Why MATLAB for Deep Learning

Ability to work with signal images financial geospatial etc data

Library of algorithms for image signal and computer vision

Built-in GPU support for functions such as image rotation convolution

transformation and filtering

Visualization

Lots of community packages eg MatConvNet Caffe deep learning

toolbox in File Exchange

28

29

Start with a pre-trained network

30

Managing image datasets

imageSet (new in R2014b)

Automated file-based workflow

ndash Labelling

ndash Partition

ndash Reading

ndash Indexing

Integrated in Computer

Vision workflows

31

Image Acquisition Toolbox

Support for

ndash industry standards including

DCAM Camera Link

and GigE Vision

ndash Common OS interfaces for webcams including Direct Show QuickTime

and video4linux2

ndash A range of industrial and scientific hardware vendors

ndash Microsoft Kinect

Built-in MATLAB support for

ndash Webcams

ndash IP Cameras

Supported hardware

32

How MATLAB addresses challenges

Large sets of images that dont fit in memory imageSet

Image Processing and Computer Vision tools for pre- and post-processing

Long running training built-in GPU support for over 200 MATLAB functions

45 Image Processing function 90 Statistics and Machine Learning

functions etc

MATLAB offers flexible architecture for customized workflows

Community toolboxes for ConvNets

33

FIN

34

FIN

Page 24: Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl and his colleagues won $22,000 with a deep-learning system. 'We improved on Merck's

24

Tools for pre- and post-processing

Also

removing average

distortions eg

rotation amp flips

etcResizingCropping

25

Iterative design

26

Why MATLAB for deep learning

27

Why MATLAB for Deep Learning

Ability to work with signal images financial geospatial etc data

Library of algorithms for image signal and computer vision

Built-in GPU support for functions such as image rotation convolution

transformation and filtering

Visualization

Lots of community packages eg MatConvNet Caffe deep learning

toolbox in File Exchange

28

29

Start with a pre-trained network

30

Managing image datasets

imageSet (new in R2014b)

Automated file-based workflow

ndash Labelling

ndash Partition

ndash Reading

ndash Indexing

Integrated in Computer

Vision workflows

31

Image Acquisition Toolbox

Support for

ndash industry standards including

DCAM Camera Link

and GigE Vision

ndash Common OS interfaces for webcams including Direct Show QuickTime

and video4linux2

ndash A range of industrial and scientific hardware vendors

ndash Microsoft Kinect

Built-in MATLAB support for

ndash Webcams

ndash IP Cameras

Supported hardware

32

How MATLAB addresses challenges

Large sets of images that dont fit in memory imageSet

Image Processing and Computer Vision tools for pre- and post-processing

Long running training built-in GPU support for over 200 MATLAB functions

45 Image Processing function 90 Statistics and Machine Learning

functions etc

MATLAB offers flexible architecture for customized workflows

Community toolboxes for ConvNets

33

FIN

34

FIN

Page 25: Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl and his colleagues won $22,000 with a deep-learning system. 'We improved on Merck's

25

Iterative design

26

Why MATLAB for deep learning

27

Why MATLAB for Deep Learning

Ability to work with signal images financial geospatial etc data

Library of algorithms for image signal and computer vision

Built-in GPU support for functions such as image rotation convolution

transformation and filtering

Visualization

Lots of community packages eg MatConvNet Caffe deep learning

toolbox in File Exchange

28

29

Start with a pre-trained network

30

Managing image datasets

imageSet (new in R2014b)

Automated file-based workflow

ndash Labelling

ndash Partition

ndash Reading

ndash Indexing

Integrated in Computer

Vision workflows

31

Image Acquisition Toolbox

Support for

ndash industry standards including

DCAM Camera Link

and GigE Vision

ndash Common OS interfaces for webcams including Direct Show QuickTime

and video4linux2

ndash A range of industrial and scientific hardware vendors

ndash Microsoft Kinect

Built-in MATLAB support for

ndash Webcams

ndash IP Cameras

Supported hardware

32

How MATLAB addresses challenges

Large sets of images that dont fit in memory imageSet

Image Processing and Computer Vision tools for pre- and post-processing

Long running training built-in GPU support for over 200 MATLAB functions

45 Image Processing function 90 Statistics and Machine Learning

functions etc

MATLAB offers flexible architecture for customized workflows

Community toolboxes for ConvNets

33

FIN

34

FIN

Page 26: Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl and his colleagues won $22,000 with a deep-learning system. 'We improved on Merck's

26

Why MATLAB for deep learning

27

Why MATLAB for Deep Learning

Ability to work with signal images financial geospatial etc data

Library of algorithms for image signal and computer vision

Built-in GPU support for functions such as image rotation convolution

transformation and filtering

Visualization

Lots of community packages eg MatConvNet Caffe deep learning

toolbox in File Exchange

28

29

Start with a pre-trained network

30

Managing image datasets

imageSet (new in R2014b)

Automated file-based workflow

ndash Labelling

ndash Partition

ndash Reading

ndash Indexing

Integrated in Computer

Vision workflows

31

Image Acquisition Toolbox

Support for

ndash industry standards including

DCAM Camera Link

and GigE Vision

ndash Common OS interfaces for webcams including Direct Show QuickTime

and video4linux2

ndash A range of industrial and scientific hardware vendors

ndash Microsoft Kinect

Built-in MATLAB support for

ndash Webcams

ndash IP Cameras

Supported hardware

32

How MATLAB addresses challenges

Large sets of images that dont fit in memory imageSet

Image Processing and Computer Vision tools for pre- and post-processing

Long running training built-in GPU support for over 200 MATLAB functions

45 Image Processing function 90 Statistics and Machine Learning

functions etc

MATLAB offers flexible architecture for customized workflows

Community toolboxes for ConvNets

33

FIN

34

FIN

Page 27: Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl and his colleagues won $22,000 with a deep-learning system. 'We improved on Merck's

27

Why MATLAB for Deep Learning

Ability to work with signal images financial geospatial etc data

Library of algorithms for image signal and computer vision

Built-in GPU support for functions such as image rotation convolution

transformation and filtering

Visualization

Lots of community packages eg MatConvNet Caffe deep learning

toolbox in File Exchange

28

29

Start with a pre-trained network

30

Managing image datasets

imageSet (new in R2014b)

Automated file-based workflow

ndash Labelling

ndash Partition

ndash Reading

ndash Indexing

Integrated in Computer

Vision workflows

31

Image Acquisition Toolbox

Support for

ndash industry standards including

DCAM Camera Link

and GigE Vision

ndash Common OS interfaces for webcams including Direct Show QuickTime

and video4linux2

ndash A range of industrial and scientific hardware vendors

ndash Microsoft Kinect

Built-in MATLAB support for

ndash Webcams

ndash IP Cameras

Supported hardware

32

How MATLAB addresses challenges

Large sets of images that dont fit in memory imageSet

Image Processing and Computer Vision tools for pre- and post-processing

Long running training built-in GPU support for over 200 MATLAB functions

45 Image Processing function 90 Statistics and Machine Learning

functions etc

MATLAB offers flexible architecture for customized workflows

Community toolboxes for ConvNets

33

FIN

34

FIN

Page 28: Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl and his colleagues won $22,000 with a deep-learning system. 'We improved on Merck's

28

29

Start with a pre-trained network

30

Managing image datasets

imageSet (new in R2014b)

Automated file-based workflow

ndash Labelling

ndash Partition

ndash Reading

ndash Indexing

Integrated in Computer

Vision workflows

31

Image Acquisition Toolbox

Support for

ndash industry standards including

DCAM Camera Link

and GigE Vision

ndash Common OS interfaces for webcams including Direct Show QuickTime

and video4linux2

ndash A range of industrial and scientific hardware vendors

ndash Microsoft Kinect

Built-in MATLAB support for

ndash Webcams

ndash IP Cameras

Supported hardware

32

How MATLAB addresses challenges

Large sets of images that dont fit in memory imageSet

Image Processing and Computer Vision tools for pre- and post-processing

Long running training built-in GPU support for over 200 MATLAB functions

45 Image Processing function 90 Statistics and Machine Learning

functions etc

MATLAB offers flexible architecture for customized workflows

Community toolboxes for ConvNets

33

FIN

34

FIN

Page 29: Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl and his colleagues won $22,000 with a deep-learning system. 'We improved on Merck's

29

Start with a pre-trained network

30

Managing image datasets

imageSet (new in R2014b)

Automated file-based workflow

ndash Labelling

ndash Partition

ndash Reading

ndash Indexing

Integrated in Computer

Vision workflows

31

Image Acquisition Toolbox

Support for

ndash industry standards including

DCAM Camera Link

and GigE Vision

ndash Common OS interfaces for webcams including Direct Show QuickTime

and video4linux2

ndash A range of industrial and scientific hardware vendors

ndash Microsoft Kinect

Built-in MATLAB support for

ndash Webcams

ndash IP Cameras

Supported hardware

32

How MATLAB addresses challenges

Large sets of images that dont fit in memory imageSet

Image Processing and Computer Vision tools for pre- and post-processing

Long running training built-in GPU support for over 200 MATLAB functions

45 Image Processing function 90 Statistics and Machine Learning

functions etc

MATLAB offers flexible architecture for customized workflows

Community toolboxes for ConvNets

33

FIN

34

FIN

Page 30: Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl and his colleagues won $22,000 with a deep-learning system. 'We improved on Merck's

30

Managing image datasets

imageSet (new in R2014b)

Automated file-based workflow

ndash Labelling

ndash Partition

ndash Reading

ndash Indexing

Integrated in Computer

Vision workflows

31

Image Acquisition Toolbox

Support for

ndash industry standards including

DCAM Camera Link

and GigE Vision

ndash Common OS interfaces for webcams including Direct Show QuickTime

and video4linux2

ndash A range of industrial and scientific hardware vendors

ndash Microsoft Kinect

Built-in MATLAB support for

ndash Webcams

ndash IP Cameras

Supported hardware

32

How MATLAB addresses challenges

Large sets of images that dont fit in memory imageSet

Image Processing and Computer Vision tools for pre- and post-processing

Long running training built-in GPU support for over 200 MATLAB functions

45 Image Processing function 90 Statistics and Machine Learning

functions etc

MATLAB offers flexible architecture for customized workflows

Community toolboxes for ConvNets

33

FIN

34

FIN

Page 31: Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl and his colleagues won $22,000 with a deep-learning system. 'We improved on Merck's

31

Image Acquisition Toolbox

Support for

ndash industry standards including

DCAM Camera Link

and GigE Vision

ndash Common OS interfaces for webcams including Direct Show QuickTime

and video4linux2

ndash A range of industrial and scientific hardware vendors

ndash Microsoft Kinect

Built-in MATLAB support for

ndash Webcams

ndash IP Cameras

Supported hardware

32

How MATLAB addresses challenges

Large sets of images that dont fit in memory imageSet

Image Processing and Computer Vision tools for pre- and post-processing

Long running training built-in GPU support for over 200 MATLAB functions

45 Image Processing function 90 Statistics and Machine Learning

functions etc

MATLAB offers flexible architecture for customized workflows

Community toolboxes for ConvNets

33

FIN

34

FIN

Page 32: Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl and his colleagues won $22,000 with a deep-learning system. 'We improved on Merck's

32

How MATLAB addresses challenges

Large sets of images that dont fit in memory imageSet

Image Processing and Computer Vision tools for pre- and post-processing

Long running training built-in GPU support for over 200 MATLAB functions

45 Image Processing function 90 Statistics and Machine Learning

functions etc

MATLAB offers flexible architecture for customized workflows

Community toolboxes for ConvNets

33

FIN

34

FIN

Page 33: Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl and his colleagues won $22,000 with a deep-learning system. 'We improved on Merck's

33

FIN

34

FIN

Page 34: Deep Learning for Computer Vision with MATLAB · Deep learning is getting a lot of attention "Dahl and his colleagues won $22,000 with a deep-learning system. 'We improved on Merck's

34

FIN